How to create an ETL process with Boomi, using AWS S3 and Salesforce

published on 09 April 2024

Extract, Transform, and Load (ETL) is a fundamental data processing technique to prepare data for analysis. It involves gathering data from various sources (extract), cleaning and converting it into a consistent format (transform), and then loading it into a central repository, where it's ready for tasks like business intelligence and machine learning.

These processes are the backbone of data integration, facilitating the movement of information between systems with efficiency and accuracy.  

This guide simplifies data integration between AWS S3 and Salesforce using Boomi. Learn to extract, transform, and load data, gaining valuable insights for decision-making.  No matter your Boomi experience, this tutorial equips you to build a seamless data pipeline.

TL;DR

To build a simple ETL process that collects files from your AWS S3 bucket, checks if the files were last modified today, and stores them in Salesforce with Boomi, follow these steps:

  1. Create a new Boomi integration process
  2. Collecting the data from the AWS S3 bucket
  3. Use the Business rule to figure out files for the last modified date of today
  4. Collect the files that were modified today from the S3 bucket
  5. Change file type from CSV to XML
  6. Adding a Salesforce connector to store the data
  7. Mapping the CSV values to the Salesforce fields
  8. Testing the Boomi process

Prerequisites

To get the most out of this article, you must have the following:

Creating files in your AWS S3 bucket

Before uploading files to your S3 bucket, you must create the CSV files you want to store. To accomplish this, create a new directory on your laptop using the following command:

mkdir <name of your directory>

Next, open this directory in your code editor and create as many CSV files as possible. This tutorial creates three CSV files: booking-1.csv, booking-2.csv, and booking-3.csv.

Populate these files with the CSV data you want to store. For instance, booking-1.csv can have this data, with each field separated by a comma:

booking_id,customer_name,customer_email,customer_company
123,Alice Johnson,alice.johnson@hospitality.com,Grand View Hotel
456,Brian Jones,brian.jones@travelagency.net,Adventure Tours

Populate the rest of the booking files with the data you want to store.

Uploading these files to your AWS S3 bucket

To upload your CSV files to an S3 bucket, search for S3 in the search bar and then click on the S3 service in the modal.

The AWS services modal
The AWS services modal

Clicking on the S3 service redirects you to the Amazon S3 page. To create a new S3 bucket, click this page's Create bucket button.

The AWS S3 bucket page
The AWS S3 bucket page

The Create bucket button leads you to the General configuration page of the S3 bucket you're about to create. Enter your desired Bucket Name, then click the Create bucket button to complete the creation process.

The AWS S3 create bucket page
The AWS S3 create bucket page
The AWS S3 create bucket button
The AWS S3 create bucket button

When you are done, you should see your newly created bucket on your Amazon S3 page.

Copy your AWS access and secret key

The AWS access and secret key allow applications or users to access the data in your S3 bucket.

To retrieve these keys, you'll need to create a user and specify which artifacts the user can access.

To create a user, search for and click on the IAM service. Then, open the Users page by clicking the Users tab in the sidebar.

The AWS services modal
The AWS services modal
The IAM dashboard, with an arrow pointing to the Users tab
The IAM dashboard, with an arrow pointing to the Users tab

On this page, click the Create User button. This action will open the Specify user details page. Input the user name you want to create, then click Next to proceed to the Set Permissions page.

The Specify user details page
The Specify user details page

Choose the Attach Policies Direct option on this page, then search for the AmazonS3FullAccess permission policy and click on it.

The AWS set permissions page
The AWS set permissions page

Click the Next and Create User buttons to create your new user. You will then see your newly created user on your User page.

The AWS user Review and create page
The AWS user Review and create page

To find your AWS access and secret access key, open your newly created user page by selecting the user from the sidebar, then click the Security Credentials tab.  

The users page, with an arrow pointing to the Security credentials tab
The users page, with an arrow pointing to the Security credentials tab

Scroll down to the Access keys section, and click the Create access key button.

The Create access key button
The Create access key button

This action leads you to the Access key best practices and alternatives page. Choose the Third-party Service option on this page since you'll manage the data in your S3 bucket with Boomi.

After that, check the box, "I understand the above recommendation and want to proceed to create an access key," and click the Next button.

The Third-party Service option
The Third-party Service option

You can input a description tag for the access key if you'd like, but it's not required. Then, click the Create access key button.

After this step, you will be directed to the Retrieve access keys page. Make sure to copy both the Access and Secret access keys. These keys are necessary for connecting your Boomi process to the AWS S3 bucket.

The Retrieve access key page
The Retrieve access key page

Click on the Done button to return to your user page.

How to create the ETL process

This tutorial creates an ETL process that enables you to collect data stored on the AWS S3 bucket.  Next, you'll filter this data to return only the records modified today, transform them into XML format, and then store them in Salesforce. This process empowers the sales team and improves management's visibility and decision-making. 

To create this ETL process in Boomi, follow these steps:

1. Create a new Boomi integration process

To create this process, go to the Integration page, and click the Create New button, and choose Process from the dropdown menu.

The Boomi Create New button
The Boomi Create New button

This action will open the Start Shape sidebar. Choose the No Data radio button, indicating that the process will not receive external data, and then confirm by clicking the OK button.

The Boomi Start Shape sidebar
The Boomi Start Shape sidebar

2. Collecting the data from the AWS S3 bucket 

After initiating your new Boomi process, add your AWS S3 bucket, which contains the files you want to extract, transform, and load into the Salesforce application. However, it is important to incorporate error handling to catch any errors during connection.

Add the Try/Catch shape to your Boomi process to handle errors.

To add a shape to a Boomi process, click on the small box at the end of the dotted lines, leaving the previous shape (in this case, the Start shape). Then, search for the shape you want to add to the process (the Try/Catch shape).

The Try/Catch Shape
The Try/Catch Shape

Add the Amazon S3 REST connector to the Try path. This connector links your S3 bucket to the Boomi shape.  Next, click on the Amazon S3 REST connector to configure the shape. This action opens up the Connector Shape sidebar. 

The Amazon S3 REST shape
The Amazon S3 REST shape
The Amazon S3 REST connector shape sidebar
The Amazon S3 REST connector shape sidebar

In this sidebar, select QUERY as the Action from the dropdown. Then, click the + in the Connection input field to create a new AWS S3 connection.

The Amazon S3 REST connection input field
The Amazon S3 REST connection input field

This action opens a connection page that requires you to input credentials for your S3 bucket. These credentials include the Amazon AWS Access Key and your Amazon AWS Secret Key. They grant your Boomi process the necessary permissions to access the data in your buckets.

The Amazon S3 REST connection page
The Amazon S3 REST connection page

Next, click the Save and Close button to return to your Connector Shape sidebar.

After creating the connection, click the + button in the Operation input field to define how to interact with the S3 bucket.

The Amazon S3 REST operation input field
The Amazon S3 REST operation input field

This action opens up the Operation page. On this page, click the Import Operation button. 

The Amazon S3 REST operation
The Amazon S3 REST operation

This action opens up an Amazon S3 REST Operation Import modal. In this modal, fill in the required information:

  • Atom: The atom you are running your Boomi process in
  • Connection: The Amazon S3 REST connection you just created
The Amazon S3 REST Operation Import modal
The Amazon S3 REST Operation Import modal

Click the Next button to open the Amazon S3 REST Operation Import modal. In this modal, choose the AWS S3 bucket from which you want to retrieve the files, then click Next.

The Amazon S3 REST Operation Import modal with the preferred S3 bucket option
The Amazon S3 REST Operation Import modal with the preferred S3 bucket option

Next, click the Finish button to save your newly loaded operation.

The Amazon S3 REST Operation Import modal with the loaded operation configuration 
The Amazon S3 REST Operation Import modal with the loaded operation configuration 

You should see your S3 bucket in the operation page's Objects section. Click on this bucket.

Next, you'll define a filter criteria for the Prefix field corresponding to the S3 bucket file path. To do this, click the downward arrow next to Filter. This will open up an Add Logical Sub-Group menu; select this option.

The Amazon S3 REST Operation objects section
The Amazon S3 REST Operation objects section

Next, click the downward-facing arrow and then on the Add Expression menu. This action opens up a form. In this form, input a filter name of Prefix, click the field search bar, and select the prefix menu. 

The Amazon S3 REST Operation prefix field
The Amazon S3 REST Operation prefix field

Finally, click the Save and Close button to return to the Amazon S3 sidebar.

Next, go to the Parameters tab to specify that you want to query all the files in the AWS S3 bucket with the static value 'bookings'. To do that, click the + sign in this tab to open the Parameter value modal. Fill out this modal with the following information:

  • Input -  Prefix
  • Type -Static
  • Static Value - bookings
The Amazon S3 REST Connector Parameter Value 
The Amazon S3 REST Connector Parameter Value 

Click the OK button to view your newly created parameter in your AWS S3 connector sidebar.

Screenshot-2024-04-07-175058-xyzgj

Click the OK button in the Connector sidebar to return to the canvas.

After connecting your Boomi process to your S3 bucket, add a Stop shape at the end of the catch branch to stop the process if any errors are encountered.

The Stop shape added to the catch path of the process
The Stop shape added to the catch path of the process

3. Use the Business rule to figure out files for the last modified date of today

After connecting your process to your S3 bucket, add a Business Rules shape to filter the files in your bucket based on whether they were modified today.

The Boomi Business Rules Shape
The Boomi Business Rules Shape

Click on the Business Rules shape to open the Business Rules Shape Options modal. In this modal, select a Profile Type of XML. In the Profile input field, you will see the profile you created for the Amazon connector. Simply select that profile.

The Boomi Business Rules Shape Options modal
The Boomi Business Rules Shape Options modal

Next, click the OK button to open the Business Rules Shape modal. In this modal, click the Create a New Rule button to create a new business rule.

The Getting Started with Business Rules Shape
The Getting Started with Business Rules Shape

On this New Business Rule page, click the Add dropdown and select the Function tab. Selecting the Function tab opens up an Add a Function modal. 

The New Business Rule
The New Business Rule

In this modal, select a Category of Date, select the Get Current Date function to get the current date, and then click the OK -> OK buttons to save this selection.

The Business Rules Add a Function modal
The Business Rules Add a Function modal

Next, add the second input for the comparison; this input will be the file's last modified date.

To add this second input, click the Add dropdown, then select Field from the menu. This action will open an Add a Field modal. Click on the Field input area to open a New Input modal within this modal. Choose the LastModified field and click the OK -> OK button to save your selection.

The Business Rules New Input modal
The Business Rules New Input modal

Next, add a condition to compare these values and check if they are the same. This means checking if the last modified date equals the present date.

To create this condition, click the Add a Condition link.

The Add a condition link 
The Add a condition link 

In the dropdowns, select the following fields.

  • In the first dropdown, select the  LastModified(Contents/LastModified) field
  • In the second dropdown, select the = field
  • In the third dropdown, select the Get Current Date field
The New Business Rule modal 
The New Business Rule modal 

Next, select the Save button and click OK to save the business rules. 

When checking for the condition in the Business Rules shape, it can take one of two paths, Accepted or Rejected.  If the files are rejected, you'll want to end their journey. To do that, add the End and Continue shape to the Rejected path from the Business Rules shape.

You'll want to retrieve and store the remaining files that pass this condition in your Salesforce application.

4. Collect the files that were modified today from the S3 bucket

To retrieve the files that meet these conditions, you need to add another Amazon S3 REST connector to the accepted path from the Business Rules shape.

The AWS S3 GET Connector on the Accepted path of the Business Rules Shape
The AWS S3 GET Connector on the Accepted path of the Business Rules Shape

Next, click on the Amazon S3 REST shape to configure it. In the Connection input field, select the Amazon S3 REST Connection you created earlier. Then, choose the GET action and click the + button in the operation field to create a new operation.

The AWS S3 connector operation
The AWS S3 connector operation

This action opens up the Operation page. Click on the Import Operation, and fill out the modal with your credentials:

  • Atom: Select the Atom you want to run your Boomi process in 
  • Connection: The AWS S3 REST connection you just created.

Click the Next button to open the Amazon S3 REST Operation Import modal. In this modal, select the S3 bucket with the files you want to get, then click on the Next -> Finish buttons.

The Amazon S3 REST Operation Import with your preferred s3 bucket
The Amazon S3 REST Operation Import with your preferred s3 bucket
The Amazon S3 REST Operation Import with your loaded operation
The Amazon S3 REST Operation Import with your loaded operation

Click the Save and Close button to save your Amazon S3 operation and return to the sidebar.

Next, you pass in the file key as a parameter to the next AWS call, allowing you to read the content of the file key. To pass the file key as a parameter,  go to your sidebar, navigate to the Parameters section, and click the + button to create a new parameter. 

The Amazon S3 REST Connector Shape Parameter value
The Amazon S3 REST Connector Shape Parameter value

This action opens up a Parameter Value modal. In this modal, fill in the following information:

  • Input - Select the ID field in the New Input modal 
  • Type - Profile Element
  • Profile Type -  XML
  • Profile - Your newly created query Amazon S3 profile
  • Element - Key

Click on the OK button to save your parameter. Your newly created parameter should appear in the parameter section.

The Parameter value modal
The Parameter value modal

Then click on the OK button to save your Amazon S3 REST configuration. 

5. Change file type from CSV to XML

Next, you will transform the CSV files you get from the AWS S3 bucket to an XML format for more compatibility with your Salesforce application.

To change this CSV format, add a Map Shape to the Boomi process. Click on this shape to configure it, which opens up a Map sidebar.

The Boomi Map Shape
The Boomi Map Shape

Click the + sign in the Map input field in this sidebar to open the Map page.

The Boomi Map Shape input field
The Boomi Map Shape input field

Go to the Map source section on this page and select the Choose link. 

The Map source Choose link
The Map source Choose link

This action opens the Choose a Source Profile modal; in this modal, select the Profile Type of Flat File, and in the Profile field, select the CSV profile you created previously. 

The Map Choose a Source Profile modal
The Map Choose a Source Profile modal

Click the OK button to see your data fields populated in the Elements menu.

Next, Click the Save and Close button, as you will need to create a Salesforce profile before you can map the data from the CSV file to the Salesforce field.

6. Adding a Salesforce connector to store the data

Next, add a Salesforce connector to your Boomi process.

The Salesforce connector shape
The Salesforce connector shape

Click the Salesforce connector to open the sidebar and configure it. In the sidebar, fill out the input fields with the following information:

  • Connection: Salesforce connection allows you to connect your Boomi process to the Salesforce
  • Action: The action will be Send as you are upserting data into your Salesforce application instead of retrieving
  • Operation: This defines how to interact with your Salesforce account.
The Salesforce connector shape sidebar, filled in with the connector and operation
The Salesforce connector shape sidebar, filled in with the connector and operation

To learn how to create a Salesforce connection and Operation, check out How to sync your MySQL database with Salesforce in Docker using Boomi.

Click the OK button to save the configuration for the Salesforce connector Shape.

7. Mapping the CSV values to the Salesforce fields

To map your data to the fields in Salesforce, open up your CSV to XML map shape, and click the pencil icon to edit the Map shape.

On the right side of the page, the Map destination, find the Choose button and click it. This action will open the Choose a Profile modal.

The Map destination Choose link
The Map destination Choose link

Select XML from the Profile Type dropdown menu within the modal. Then, choose the Salesforce Profile you just created. Confirm your selection by clicking OK.

The Choose a Destination Profile
The Choose a Destination Profile

The next step involves linking the corresponding fields between your CSV file and Salesforce. Simply drag your cursor from a field on the left side and connect it to its matching field on the right. For example, in the diagram below, the following fields are mapped together:

  • booking_id -> Booking_ID_c
  • customer_name -> Name
  • customer_email ->Email_c
  • customer_company -> Company_c

 Once you've mapped all the fields, your page should resemble the example provided.

The Mapped values from Map source to the Map destination fields
The Mapped values from Map source to the Map destination fields

Next, click the Save and Close button to save this Map and return to the sidebar. 

Then, click the OK button on the sidebar to save your Map Shape configuration and return to your Boomi process canvas. 

Finally, add an End and Continue shape to signify the end of the process.

The entire AWS to Salesforce ETL  Boomi process
The entire AWS to Salesforce ETL  Boomi process

6. Testing the Boomi process

To verify if the process is working, click the Arrange button, then save and test the process.

Next, click on the Salesforce application and navigate to the Shape Source Data tab below.

The Shape Source Data tab
The Shape Source Data tab

You will see the data packets; click on one of these packets to see the data in your database.

The Document Viewer modal
The Document Viewer modal

After completing the process, your database data will be visible in your Salesforce application.

The Salesforce Accounts tab
The Salesforce Accounts tab

That's a Wrap

This guide has shown you how to build a basic ETL process using Boomi. You learned how to extract data from your AWS S3 bucket, check if it was modified today, transform it into an XML file format, and finally load it into your Salesforce application for easy access.

Read more