Azure Queue Storage ETL Task
The RavenDB Azure Queue Storage ETL task:
- Extracts selected data from RavenDB documents in the specified collections.
- Transforms the data into JSON objects.
- Wraps the JSON objects as CloudEvents messages and loads them to an Azure Queue Storage queue.
The Azure Queue Storage ETL task transfers documents only.
Document extensions like attachments, counters, time series, and revisions are not sent.
The maximum message size in Azure Queue Storage is 64 KB; documents larger than this won’t be loaded.
The Azure Queue Storage enqueues incoming messages at the tail of a queue. Azure Functions can be triggered to access and consume messages when the enqueued messages advance to the queue head.
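To illustrate the size limit and the CloudEvents wrapping mentioned above, here is a minimal Node.js sketch. It is not RavenDB's implementation: `wrapAsCloudEvent` and `fitsInQueueMessage` are hypothetical helpers, the envelope only carries the attributes that CloudEvents 1.0 marks as required, and the exact message RavenDB produces (including any extra encoding applied before sending) may be larger than this estimate.

```javascript
// Assumption: the 64 KB limit is taken as 65,536 bytes.
const MAX_MESSAGE_BYTES = 64 * 1024;

// Hypothetical envelope with the required CloudEvents 1.0 attributes plus the payload.
function wrapAsCloudEvent(data, { id, type, source }) {
  return { specversion: "1.0", id, type, source, data };
}

// Rough check: UTF-8 size of the serialized message against the limit.
function fitsInQueueMessage(message) {
  return Buffer.byteLength(JSON.stringify(message), "utf8") <= MAX_MESSAGE_BYTES;
}

// Made-up payload for illustration only.
const message = wrapAsCloudEvent(
  { Id: "orders/1-A", TotalCost: 30 },
  { id: "orders/1-A", type: "com.github.users", source: "/registrations/direct-signup" }
);
console.log(fitsInQueueMessage(message));
```

A check like this can help spot documents whose transformed output is likely to be too large before the task is enabled.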
This page explains how to create an Azure Queue Storage ETL task using the Studio.
To learn how to define an Azure Queue Storage ETL task using the Client API, see the Client API documentation.
Open Azure Queue Storage ETL task view
- Ongoing Tasks
  Click to open the ongoing tasks view.
- Add a Database Task
  Click to create a new ongoing task.
Define Azure Queue Storage ETL task
- Task Name (Optional)
  - Enter a name for your task.
  - If no name is provided, the server will create a name based on the defined connection string name,
    e.g. "Queue ETL to <ConStrName>".
- Task State
  Select the task state:
  - Enabled - The task runs in the background, transforming and sending documents as defined in this view.
  - Disabled - No documents are transformed or sent.
- Set Responsible Node (Optional)
  - Select a node from the Database Group to be responsible for this task.
  - If no node is selected, the cluster will assign a responsible node (see Members Duties).
- Create new Azure Queue Storage connection string
  - The connection string contains the necessary information to connect to an Azure storage account.
    Toggle OFF to select an existing connection string from the list, or toggle ON to create a new one.
  - Name - Enter a name for the connection string.
  - Authentication method - Select the authentication method by which to connect to the Azure storage account.
    Learn more about the available authentication methods below.
- Test Connection
  After defining the connection string, click to test the connection to the Azure storage account.
- Add Transformation Script
  The sent data can be filtered and modified by multiple JavaScript transformation scripts that are added to the task.
  Click to add a transformation script.
- Advanced
  Click to open the advanced section, where you can configure the deletion of RavenDB documents per queue.
Authentication method
The available methods for authenticating to an Azure storage account are:
- Connection String
  - A single string that includes all the options required to connect to the Azure storage account.
    For details, see Microsoft's documentation on Azure Storage connection strings.
  - The following connection string parameters are mandatory:
    - AccountName
    - AccountKey
    - DefaultEndpointsProtocol
    - QueueEndpoint (when using the http protocol)
- Entra ID
  - Use the Entra ID authorization method to achieve enhanced security by leveraging Microsoft Entra’s robust identity solutions.
  - This approach minimizes the risks associated with exposed credentials commonly found in connection strings and enables more granular control through role-based access control (RBAC).
- Passwordless
  - This authorization method requires the machine to be pre-authorized and can only be used in self-hosted mode.
  - Passwordless authorization works only when the account on the machine is assigned the Storage Queue Data Contributor role; the Contributor role alone is not sufficient.
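For the Connection String method, the mandatory parameters listed above can be checked before pasting a string into the Studio. The following is a minimal Node.js sketch, not part of RavenDB or the Azure SDK: `findMissingParameters` is a hypothetical helper, and the account name and key shown are placeholders. It assumes the usual `Key=Value;Key=Value` layout of Azure Storage connection strings.

```javascript
// Hypothetical helper: returns the mandatory parameters that are absent or empty.
function findMissingParameters(connectionString) {
  const params = new Map();
  for (const part of connectionString.split(";")) {
    // Split on the first '=' only, since Base64 account keys may end with '=' padding.
    const idx = part.indexOf("=");
    if (idx <= 0) continue;
    params.set(part.slice(0, idx).trim(), part.slice(idx + 1).trim());
  }

  const required = ["AccountName", "AccountKey", "DefaultEndpointsProtocol"];
  // QueueEndpoint is mandatory only when the http protocol is used.
  if ((params.get("DefaultEndpointsProtocol") || "").toLowerCase() === "http") {
    required.push("QueueEndpoint");
  }
  return required.filter((key) => !params.get(key));
}

// Placeholder values for illustration only.
console.log(findMissingParameters(
  "DefaultEndpointsProtocol=http;AccountName=myaccount;AccountKey=<account-key>"
)); // → ["QueueEndpoint"]
```

An empty result means all mandatory parameters are present; it does not confirm that the credentials are valid, which is what the Test Connection button is for.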
Options per queue
You can configure the ETL process to delete documents from RavenDB that have already been sent to the queues.
- Add queue options
  Click to add a per-queue option.
- Queue Name
  Enter the name of the Azure Queue Storage queue the documents are loaded to.
  Note: The queue name defined in the transform script must follow the rules outlined in Naming Queues and Metadata.
- Delete processed documents
  Enable this option to remove documents that were processed and loaded to the Azure Queue Storage from the RavenDB database.
- Delete queue option
  Click to delete the queue option from the list.
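The queue naming rules referenced above can be approximated with a single pattern. This is a sketch under assumptions: `isValidQueueName` is a hypothetical helper that encodes the Azure rules as commonly documented (3 to 63 characters; lowercase letters, digits, and hyphens; starts and ends with a letter or digit; no consecutive hyphens). Whether RavenDB normalizes the name used in the script before creating the queue is not covered here.

```javascript
// Hypothetical validator for Azure Queue Storage queue names.
// The lookahead enforces the 3-63 length; the rest allows alphanumeric runs
// separated by single hyphens, which rules out leading, trailing, and double hyphens.
const QUEUE_NAME_PATTERN = /^(?=.{3,63}$)[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isValidQueueName(name) {
  return QUEUE_NAME_PATTERN.test(name);
}

console.log(isValidQueueName("order-events")); // → true
console.log(isValidQueueName("my--queue"));    // → false
```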
Add transformation script
- Add transformation script
  Click to add a new transformation script that will process documents from RavenDB collection(s).
- Edit transformation script
  Click to edit this script.
- Delete script
  Click to remove this script.
- Script Name - Enter a name for the script (Optional).
  A default name will be generated if no name is entered, e.g. Script_1.
- Script - Edit the transformation script.
  Sample script:

      // Define a "document object" whose contents will be extracted from RavenDB documents
      // and sent to the Azure Queue Storage, e.g. 'var orderData':
      var orderData = {
          // Verify that one of the properties of this object is given the value 'id(this)'.
          Id: id(this), // Property with RavenDB document ID
          OrderLinesCount: this.Lines.length,
          TotalCost: 0
      };

      for (var i = 0; i < this.Lines.length; i++) {
          var line = this.Lines[i];
          var cost = (line.Quantity * line.PricePerUnit) * (1 - line.Discount);
          orderData.TotalCost += cost;
      }

      // Use the 'loadTo<QueueName>' method
      // to transfer the document object to the Azure Queue destination.
      loadToOrders(orderData, { // Load to a Queue by the name of "Orders" with optional params
          Id: id(this),
          Type: 'com.github.users',
          Source: '/registrations/direct-signup'
      });

- Syntax - Click for a transformation script syntax sample.
- Collections
  - Select (or enter) a collection
    Type or select the names of the RavenDB collections your script is using.
  - Collections Selected
    A list of collections that were already selected.
- Apply script to documents from beginning of time (Reset)
  - This toggle is available only when editing an existing script.
  - When this option is enabled:
    The script will be executed over all existing documents in the specified collections the first time the task runs.
  - When this option is disabled:
    The script will be executed only over new and modified documents.
- Add/Update - Click to add a new script or update an existing script.
  Cancel - Click to cancel your changes.
- Test Script - Click to test the transformation script.
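The sample script relies on `this`, `id()`, and `loadToOrders`, which RavenDB supplies when the task runs. To reason about the script's arithmetic outside the Studio, the same logic can be run in Node.js against hypothetical stand-ins. In the sketch below, the `id` and `loadToOrders` stubs, the `transform` wrapper, and the order document are all made up for illustration; they do not reflect how RavenDB implements these functions.

```javascript
// Hypothetical stand-ins for what RavenDB provides to a transformation script.
const sent = []; // collects whatever the stubbed loadToOrders receives
function id(doc) { return doc["@metadata"]["@id"]; }
function loadToOrders(body, attributes) { sent.push({ body, attributes }); }

// Same logic as the sample script, wrapped in a function so that 'this' is the document.
function transform() {
  var orderData = {
    Id: id(this),
    OrderLinesCount: this.Lines.length,
    TotalCost: 0
  };
  for (var i = 0; i < this.Lines.length; i++) {
    var line = this.Lines[i];
    orderData.TotalCost += (line.Quantity * line.PricePerUnit) * (1 - line.Discount);
  }
  loadToOrders(orderData, {
    Id: id(this),
    Type: 'com.github.users',
    Source: '/registrations/direct-signup'
  });
}

// Made-up order document for illustration only.
const order = {
  "@metadata": { "@id": "orders/1-A" },
  Lines: [
    { Quantity: 2, PricePerUnit: 10, Discount: 0 },
    { Quantity: 1, PricePerUnit: 20, Discount: 0.5 }
  ]
};

transform.call(order);
console.log(JSON.stringify(sent[0].body));
// {"Id":"orders/1-A","OrderLinesCount":2,"TotalCost":30}
```

The Test Script button remains the authoritative way to check a script, since it runs against real documents inside RavenDB.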