Accessing Datasets Dynamically
How to Create a File for Remote Execution with Multiple Data Files
The order in which datasets are added when a new experiment is created corresponds to their order in the list dataset_ids. Therefore, in the following case of a new experiment, the first dataset is the “RAISE Consortium questionnaire (second…)” and the second dataset is the “Iris”.
The ID of the first dataset is in dataset_ids[0], and the ID of the second dataset is in dataset_ids[1].
Python
To read the dataset IDs in Python, parse the RAISE_DATASET_ID_LIST environment variable with the json module, as in the following example:
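import json
import os

import pandas as pd

# RAISE_DATASET_ID_LIST holds a JSON list of dataset IDs, in the order the datasets were added
dataset_ids = json.loads(os.getenv("RAISE_DATASET_ID_LIST"))
# Open the Excel file of the first dataset
xls = pd.ExcelFile(f"{dataset_ids[0]}/datafile.xlsx", engine="openpyxl")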
To reuse the same script in another experiment with the same file types, create the experiment so that its first dataset is similar to the “RAISE Consortium questionnaire (second…)” dataset and its second dataset is similar to the “Iris” dataset. The script will then read the corresponding datasets dynamically.
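The positional access described above could be wrapped in two small helpers, as in the following minimal sketch. The names get_dataset_ids and dataset_file are hypothetical and not part of RAISE. The sketch assumes, as in the example above, that each dataset is available in a folder named after its ID.

```python
import json
import os
from pathlib import Path

ENV_VAR = "RAISE_DATASET_ID_LIST"


def get_dataset_ids():
    """Return the dataset IDs in the order they were added to the experiment."""
    raw = os.getenv(ENV_VAR)
    if raw is None:
        raise RuntimeError(f"{ENV_VAR} is not set")
    ids = json.loads(raw)
    if not isinstance(ids, list):
        raise ValueError(f"{ENV_VAR} must contain a JSON list")
    return ids


def dataset_file(position, filename):
    """Build the path of a file inside the dataset at the given position.

    Hypothetical helper: assumes each dataset lives in a folder named after its ID.
    """
    ids = get_dataset_ids()
    if not 0 <= position < len(ids):
        raise IndexError(f"the experiment has no dataset at position {position}")
    return Path(ids[position]) / filename
```

A script would then call dataset_file(0, "datafile.xlsx") for the first dataset instead of hard-coding a dataset ID, so the same code can run in any experiment whose datasets are added in the same order.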
Running an experiment locally
It is possible to prepare and test an experiment for the RAISE platform locally, without using the RAISE SDK.
RAISE datasets, distributed across multiple nodes, are each identified by a unique dataset_id. To ensure flexibility and enable code reuse across different experiments, datasets should be accessed using environment variables as explained above.
To simulate this behavior locally (without the SDK), follow these instructions:
- In the development directory, create a .env file containing the dataset IDs you plan to use. For example:
RAISE_DATASET_ID_LIST=["dataset_id_1", "dataset_id_2"]
Ensure the dataset IDs are listed in exactly the same order as selected on the RAISE portal. Otherwise, the dataset references in the code may not align correctly.
Dataset IDs can be found in the portal.
- Load the environment variables.
Install the python-dotenv package in the local environment:
pip install python-dotenv
Then, add the following lines at the beginning of the script to load environment variables from the .env file:
from dotenv import load_dotenv
load_dotenv()
With this setup, the script reads the environment variables in the same way as when running an experiment on the RAISE platform or in the SDK.
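Once the variables are loaded, a quick sanity check can catch a missing variable or a misplaced dataset before the script runs. The following is a minimal sketch. The function name missing_dataset_dirs is hypothetical, and the sketch assumes that local copies of the datasets are stored in folders named after their dataset IDs, which this page does not state explicitly.

```python
import json
import os
from pathlib import Path


def missing_dataset_dirs(base_dir="."):
    """Return the dataset IDs that have no matching folder under base_dir.

    Assumption: local copies of the datasets are stored in folders
    named after their dataset IDs.
    """
    raw = os.getenv("RAISE_DATASET_ID_LIST")
    if raw is None:
        raise RuntimeError(
            "RAISE_DATASET_ID_LIST is not set; was load_dotenv() called?"
        )
    dataset_ids = json.loads(raw)
    # Keep the order of the list so the report matches the .env file
    return [d for d in dataset_ids if not (Path(base_dir) / d).is_dir()]
```

Calling missing_dataset_dirs() right after load_dotenv() returns an empty list when every dataset listed in the .env file has a local folder.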