Write entries to a MongoDB database using a microcontroller and read them back using a Python script on your computer.
This assignment consists of two parts:
- Using a microcontroller to run experiments and upload data (`insert.py`)
- Using your computer to retrieve and analyze the data (`find.py`)
For this assignment, you will need to add the GitHub repository secrets listed below both as GitHub Actions secrets and Codespaces secrets, so that both you and the autograding scripts can access them. Please also keep a backup copy of these secrets in a secure location.
The secrets to be added to GitHub Actions and Codespaces are:
| Variable Name | Description |
|---|---|
| `COURSE_ID` | Your student identifier for the course |
| `DATABASE_NAME` | The name of your database within the cluster |
| `COLLECTION_NAME` | The name of the collection from your database |
| `LAMBDA_FUNCTION_URL` | The Lambda Function URL that acts as an intermediate layer between the microcontroller and the orchestrator |
| `ATLAS_URI` | The MongoDB Atlas uniform resource identifier (URI) for the Python orchestrator to connect to MongoDB (sometimes referred to as a connection string) |
While tedious, this approach is a one-time setup that simplifies the development and autograding process while adhering to security best practices. Your `COURSE_ID` can be found on the main course website by referencing the corresponding quiz response. Instructions for each of the remaining secrets (based on step 13a from the Build Instructions manuscript) and instructions on how to add secrets are provided in the following sections.
First, set up your MongoDB account and create a free-tier M0 cluster. Choose a cluster name* such as `test-cluster` and take note of the name (`CLUSTER_NAME`).

*`CLUSTER_NAME` should be 23 characters or fewer and should not end with a hyphen `-` (based on the Atlas docs).
Follow the Creating a MongoDB Database with the Atlas UI instructions within the Create a Database guide. Take note of the database name (`DATABASE_NAME`) and collection name (`COLLECTION_NAME`) that you choose: for example, `test-db` and `test-collection`, respectively.
Next, create a database user. Assign the specific privilege `readWrite` for the database you created (e.g., `test-db`) and Restrict Access to Specific Clusters for your current cluster (e.g., `test-cluster`), as shown below. Store the username and password in a secure location. Remember to click the green Add User button at the bottom-right.

You will do this in place of assigning a built-in role, which typically would grant you access to any database in any cluster. NOTE: There are many ways to customize access, including built-in roles and privileges and custom database roles.
Add `0.0.0.0/0` to the network configuration to allow access from anywhere.
You will also need to copy the MongoDB Atlas URI for your cluster. Refer to the video below. IMPORTANT: After copying the Atlas URI (connection string), you will need to manually populate `<db_username>` and `<db_password>` with your database user credentials before adding this as a GitHub secret. For example, if the copy-pasted Atlas URI is `mongodb+srv://<db_username>:<db_password>@test-cluster.ab123.mongodb.net/?retryWrites=true&w=majority` and your MongoDB username and password are `sgbaird` and `HGzZNsQ3vBLKrXXF`, respectively, then your `ATLAS_URI` would be `mongodb+srv://sgbaird:[email protected]/?retryWrites=true&w=majority`.
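Optionally, before adding `ATLAS_URI` as a secret, you can sanity-check the assembled connection string. The following is a minimal sketch, assuming `pymongo` is installed locally and that the URI has been exported as an environment variable named `ATLAS_URI`:

```python
# Minimal connectivity check for the assembled Atlas URI (a sketch).
import os

from pymongo import MongoClient

client = MongoClient(os.environ["ATLAS_URI"])
client.admin.command("ping")  # raises an exception if the URI or credentials are wrong
print("Successfully connected to MongoDB Atlas")
```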
NOTE: The instructions for this section are based on the Lambda getting started guide. There are more advanced ways to set up Lambda functions, but the following steps and level of complexity are tailored to our simple use-case.
First, create an AWS account (you will need a credit card, but the free tier should be sufficient). Next, you'll create a new Lambda function:

- Open the Functions page of the Lambda console.
- Choose Create function.
- Select Author from scratch.
- In the Basic information pane, for Function name, enter a name (e.g., `dataLogging`).
- For Runtime, choose Python 3.13.
- Click the dropdown for Additional Configurations.
- Check the Enable function URL box.
- Under Auth type, select None.
- Check the Configure cross-origin resource sharing (CORS) box.

Your settings should look like the following, with everything else left as default:

- Finally, click Create function.
Open and run the Preparing the Python code and dependencies for AWS Lambda Colab notebook, which will allow you to save a `deployment.zip` file to your local machine. If you have trouble accessing or running the notebook, there is a backup copy [permalink]. However, the notebook has the latest version and is instructive for how to use Lambda functions with Python dependencies.

Back on the Lambda function page, upload the `deployment.zip` file by clicking on the Code tab and selecting Upload from and then .zip file, then choosing the `deployment.zip` file from your machine. Click Save.
In the code editor within `lambda_function.py`, you will add your `ATLAS_URI` as a Lambda environment variable (for a similar reason to why you added it as a secret to the GitHub repository). Navigate to the Configuration tab. Along the left sidebar, click Environment variables, then Edit.

Add your Atlas URI as follows, ensuring that your database username and password are embedded correctly. Click Save.
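With the environment variable in place, the function can read the connection string at runtime. For orientation, here is a hypothetical sketch of what a minimal handler could look like; your actual `lambda_function.py` comes from the `deployment.zip` built by the Colab notebook and may differ:

```python
# Hypothetical sketch of a Lambda handler that inserts a posted JSON
# document into MongoDB. The real handler ships in deployment.zip.
import json
import os

from pymongo import MongoClient

# Read the connection string from the environment variable configured above;
# creating the client at module level lets Lambda reuse the connection.
client = MongoClient(os.environ["ATLAS_URI"])


def lambda_handler(event, context):
    # Function URL requests arrive with the JSON payload in event["body"]
    document = json.loads(event["body"])
    collection = client["test-db"]["test-collection"]  # example names from above
    result = collection.insert_one(document)
    return {
        "statusCode": 200,
        "body": json.dumps({"inserted_id": str(result.inserted_id)}),
    }
```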
Note the Function URL from the top right of the page. This is your `LAMBDA_FUNCTION_URL`.
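To verify the deployment end to end, you can send a test request to the Function URL from your computer. The sketch below uses the `requests` library and presumes the deployed function accepts a JSON body, as in the handler sketch above (on a MicroPython microcontroller, the same pattern applies with `urequests`); the payload values are illustrative:

```python
# Hypothetical smoke test for the Lambda Function URL.
# LAMBDA_FUNCTION_URL and COURSE_ID are assumed to be set as
# environment variables (e.g., from your repository secrets).
import os

import requests

payload = {
    "command": {"R": 10, "G": 20, "B": 30},  # illustrative values
    "course_id": os.environ["COURSE_ID"],
}

response = requests.post(os.environ["LAMBDA_FUNCTION_URL"], json=payload)
print(response.status_code, response.text)
```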
The interfaces for adding secrets can be found via your GitHub repository's Settings tab under the Security section on the left sidebar. Click the corresponding dropdown item in the Secrets and variables dropdown menu. Alternatively, use the direct links to navigate to these pages, which will be of the form:

`https://github.com/ACC-HelloWorld/5-data-logging-<username>/settings/secrets/actions`

and

`https://github.com/ACC-HelloWorld/5-data-logging-<username>/settings/secrets/codespaces`

where `<username>` is replaced with your GitHub username. For example:

`https://github.com/ACC-HelloWorld/5-data-logging-sgbaird/settings/secrets/actions`

and

`https://github.com/ACC-HelloWorld/5-data-logging-sgbaird/settings/secrets/codespaces`
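Once added, Codespaces secrets are exposed inside your Codespace as environment variables of the same name, so your scripts can read them with `os.environ`. For example:

```python
import os

# Each repository secret becomes an environment variable of the same name
course_id = os.environ["COURSE_ID"]
database_name = os.environ["DATABASE_NAME"]
collection_name = os.environ["COLLECTION_NAME"]
lambda_function_url = os.environ["LAMBDA_FUNCTION_URL"]
atlas_uri = os.environ["ATLAS_URI"]
```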
Update `insert.py` to iteratively run color experiments and insert the data into your MongoDB database. You will need to:

- Connect to MongoDB using the provided URI
- Run each experiment in `payload_dicts` using the `run_color_experiment()` function
- Insert the results into the database

The document structure should be of the form:
```
{
    "command": {"R": ..., "G": ..., "B": ...},
    "sensor_data": {"ch410": ..., "ch440": ..., ..., "ch670": ...},
    "experiment_id": "...",
    "course_id": "...",
    "timestamp": "..."
}
```
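As a starting point, here is a hedged sketch of one possible `insert.py` flow, assuming a direct `pymongo` connection as the task list suggests. `payload_dicts` and `run_color_experiment()` are provided by the assignment template; the call shape shown for `run_color_experiment()`, the payload keys, and the timestamp format are illustrative assumptions, so defer to your template's scaffolding:

```python
# A sketch of the insert.py loop under the assumptions stated above.
import os
from datetime import datetime, timezone

from pymongo import MongoClient

client = MongoClient(os.environ["ATLAS_URI"])
collection = client[os.environ["DATABASE_NAME"]][os.environ["COLLECTION_NAME"]]

for payload in payload_dicts:  # provided by the assignment template
    # Assumed call shape: run the RGB command and get the sensor channels back
    sensor_data = run_color_experiment(payload["command"])
    document = {
        "command": payload["command"],
        "sensor_data": sensor_data,
        "experiment_id": payload["experiment_id"],  # assumed payload key
        "course_id": os.environ["COURSE_ID"],
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
    }
    collection.insert_one(document)
```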
Update `find.py` to read all records from the database that have your course ID stored within the `course_id` field. Then, convert the results into a pandas DataFrame and save the output into a CSV file called `results.csv`.

Since your documents are nested dictionaries, you will need to use the `pd.json_normalize` function to flatten the nested dictionaries into a pandas DataFrame. For convenience, you can also set the `_id` column as the index of the pandas DataFrame via `.set_index("_id")`. See the example below:
```python
import pandas as pd

data = [
    {
        "id": "id1",
        "category1": {"item1": 1, "item2": 2, "item3": 3},
        "category2": {"itemA": 10, "itemB": 20, "itemC": 30},
    },
    {
        "id": "id2",
        "category1": {"item1": 4, "item2": 5, "item3": 6},
        "category2": {"itemA": 40, "itemB": 50, "itemC": 60},
    },
]

# Flatten the nested dictionaries into dotted column names
df = pd.json_normalize(data).set_index("id")

print(list(df.columns))
# ['category1.item1', 'category1.item2', 'category1.item3', 'category2.itemA', 'category2.itemB', 'category2.itemC']

print(df)
#      category1.item1  category1.item2  category1.item3  category2.itemA  category2.itemB  category2.itemC
# id
# id1                1                2                3               10               20               30
# id2                4                5                6               40               50               60
```
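Putting these pieces together, a sketch of the `find.py` flow might look like the following (the environment variable names follow the secrets defined earlier; your template's scaffolding may structure this differently):

```python
# A sketch of find.py: query by course_id, flatten, and save to CSV.
import os

import pandas as pd
from pymongo import MongoClient

client = MongoClient(os.environ["ATLAS_URI"])
collection = client[os.environ["DATABASE_NAME"]][os.environ["COLLECTION_NAME"]]

# Retrieve every document whose course_id field matches your course ID
results = list(collection.find({"course_id": os.environ["COURSE_ID"]}))

# Flatten the nested documents and index by MongoDB's _id field
df = pd.json_normalize(results).set_index("_id")
df.to_csv("results.csv")
```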
NOTE: Committing and pushing `results.csv` is optional and will not affect the autograding results.
See `postCreateCommand` from `devcontainer.json`.

Run the tests with:

```
pytest
```

You can also use the "Testing" sidebar extension to easily run individual tests.
- IP Access List docs ("Atlas UI" instructions)
- Create a Database guide
Earlier, we mentioned that we strategically selected a minimal implementation of AWS Lambda functions. To elaborate, we chose to use a `.zip` file for the Python dependencies rather than an AWS Lambda Layer; the latter is a more advanced feature that allows you to share code across multiple functions. We chose to use the Lambda Function URL directly rather than an API Gateway because it is simpler and more direct for our use-case. We also chose to allow the Lambda Function URL to be accessible by any machine (CORS) without authentication (other than knowing the URL itself). Finally, we chose to add our `ATLAS_URI` as an environment variable within the Lambda function rather than using AWS Secrets Manager secrets in AWS Lambda functions. The goal of this assignment is not to make you an expert at AWS Lambda, but rather to enable you to send data from your microcontroller directly to a database without overcomplicating things. If your usage of AWS Lambda grows, you may want to consider these more advanced features.
For some additional resources related to AWS Lambda: