This page lists the best practices that we recommend for Flat File Sources.
At Least One User ID per Source
Each file must have at least one user ID. This can be the client’s first-party user ID (which must be the same as the one sent in the web JS or app SDKs), preferably along with a digital identifier such as a MAID that can be used for user matching.
Accurate Country to Region Mapping
Ensure that each uploaded file goes to the region that matches the country of its records. For example, Spanish data must not be uploaded to the US bucket. If you have data or users belonging to multiple countries, start by splitting your files by country (see Appendix below) and then upload each file to the appropriate region bucket to ensure data privacy and compliance.
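The splitting step can be sketched as follows. This is a minimal illustration, not a Zeotap utility: the column names `user_id` and `country` and both function names are assumptions, and the mapping from country to region bucket is specific to your setup, so it is not shown.

```python
import csv
import io
from collections import defaultdict


def split_rows_by_country(csv_text, country_column="country", delimiter=","):
    """Group the rows of one CSV document by the value in the country column.

    The column name "country" is a hypothetical example; use your own header.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    groups = defaultdict(list)
    for row in reader:
        # Normalize only the grouping key; the row itself is left unchanged.
        groups[row[country_column].strip().upper()].append(row)
    return reader.fieldnames, dict(groups)


def render_csv(fieldnames, rows, delimiter=","):
    """Serialize one country's rows back to CSV text with the original header."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Each rendered group can then be written to its own file and uploaded to the bucket for that country's region.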
When uploading multiple files under a source, ensure that all files use the same file format and delimiter that were defined when the source was created. For example, if the source was created as a CSV file with a semicolon (;) delimiter, then all subsequent uploads to this source must use the same format. Failure to do this results in data corruption. Note that CSV files must not contain double quotation marks ("") and their fields must be separated by a comma (,) only. Click here to download the sample file.
Casing of Column Names
When uploading multiple files under a source, ensure that the same attribute always has the same name, with the same casing. This ensures proper mapping at the collection level. Otherwise, an attribute whose name or casing differs is treated as a new column.
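A pre-upload check for this can be sketched as below. It is a hypothetical helper, not part of the product: it compares the header of a new file against the header used by earlier uploads and separates columns that differ only in casing from columns that are new.

```python
def compare_headers(expected, incoming):
    """Compare a new file's header against the header of earlier uploads.

    Returns (casing_mismatches, new_columns), where casing_mismatches is a
    list of (incoming_name, expected_name) pairs that differ only by casing.
    """
    expected_by_lower = {name.lower(): name for name in expected}
    casing_mismatches = []
    new_columns = []
    for name in incoming:
        if name in expected:
            continue  # exact match, nothing to report
        canonical = expected_by_lower.get(name.lower())
        if canonical is not None:
            casing_mismatches.append((name, canonical))
        else:
            new_columns.append(name)
    return casing_mismatches, new_columns
```

A casing mismatch is usually a mistake to fix before upload, while a new column may be intentional and then needs to be mapped.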
If the file upload is not set up from the Collect UI but through another method, ensure that the files are uploaded to the exact path specified for that source. Otherwise, the processing of this data can be affected and, subsequently, so can reporting and segment creation.
Country Data in the File
Records must be tied to a country in order to create data collections. Always include the country as a field in the files; otherwise, the country has to be hardcoded later. We recommend using ISO 3166-1 alpha-3 (three-letter) country codes for sending the country information.
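A quick way to spot records with a missing or malformed country value is sketched below. The column name `country` and the function name are assumptions, and the check covers only the shape of an alpha-3 code (three uppercase letters), not whether the code is actually assigned in ISO 3166-1.

```python
import csv
import io
import re

# Shape check only: three uppercase ASCII letters. This does not confirm
# that the code is an assigned ISO 3166-1 alpha-3 code.
ALPHA3_SHAPE = re.compile(r"[A-Z]{3}")


def rows_with_bad_country(csv_text, country_column="country", delimiter=","):
    """Return (line_number, value) for rows whose country is missing or malformed.

    Line numbers assume one record per line, with the header on line 1.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    bad = []
    for line_no, row in enumerate(reader, start=2):
        value = (row.get(country_column) or "").strip()
        if not ALPHA3_SHAPE.fullmatch(value):
            bad.append((line_no, value))
    return bad
```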
Number of Fields
While there’s no restriction on the schema, it is recommended that you check the Zeotap catalogue and start with only the fields that are relevant to the source. You can create new sources for newer data points you want to capture. If new fields are sent for the existing sources, then ensure that you map the new fields to start ingesting the data.
Mentioned below are the validations to be performed on the files, the catalogue, and the mapping.
On File
Ask the customer to share a sample data file with the actual data. Validate the sample file to ensure the following points:
- Encoding type is one of the supported types
- File format matches the selected source option
- Delimiter matches the selected option
- No fields are repeated
- All columns have a header (post-ingestion, check whether any header column appears as _CX in the Preview section)
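Several of the file checks above can be automated before the sample is uploaded. The sketch below is illustrative only: the function name is hypothetical, UTF-8 is assumed as the expected encoding (check the list of supported types for your source), and the single-column test is a heuristic for a delimiter mismatch rather than a definitive detection.

```python
import csv
import io


def validate_sample_file(raw_bytes, delimiter=",", encoding="utf-8"):
    """Return a list of human-readable problems found in a delimited file."""
    try:
        text = raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return [f"file does not decode as {encoding}"]

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]

    problems = []
    header = [name.strip() for name in rows[0]]
    if len(header) < 2:
        # Heuristic: a wrong delimiter usually collapses the header into one column.
        problems.append("only one column found; delimiter may not match")
    if any(name == "" for name in header):
        problems.append("at least one column has no header")
    duplicates = sorted({n for n in header if n and header.count(n) > 1})
    if duplicates:
        problems.append("repeated fields: " + ", ".join(duplicates))
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems
```

An empty list means none of these checks failed; it does not replace the post-ingestion check in the Preview section.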
On Catalogue
After a source is created, ask the customer to push a sample data file with the actual data. The values passed for the field in the sample file must be in line with the catalogue definition. Validate the sample data against your catalogue to ensure the following points:
- Data type
- Attribute type
- Date field’s timestamp format is one of the acceptable formats
- Country enricher must be ISO 2 or ISO 3 or hardcoded
- Field name (including casing) is consistent across the Source catalogues, and the same name is reused wherever the field recurs
On Mapping
Before saving the mapping, ensure the following points:
- The fields are mapped to the correct Zeotap field.
- Enrichers are applied wherever required and as per the incoming value.
- Select Timestamp from the list of supported formats. Once a format is selected, the system expects the same format throughout the source’s lifetime; otherwise, ingestion may fail.
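Because the selected timestamp format has to hold for every later upload, it can help to check new files against it before uploading. The following is a minimal sketch with a hypothetical function name; the format string in the usage example is an illustrative assumption, not a statement of which formats are supported.

```python
from datetime import datetime


def values_not_matching_format(values, timestamp_format):
    """Return the values that cannot be parsed with the given strptime format."""
    bad = []
    for value in values:
        try:
            datetime.strptime(value, timestamp_format)
        except ValueError:
            # Raised for a different layout and for impossible dates alike.
            bad.append(value)
    return bad
```

For example, with the assumed format `"%Y-%m-%d %H:%M:%S"`, a value such as `31/01/2024` is reported because its layout differs, and `2024-02-30 00:00:00` is reported because the date does not exist.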
Last modified on February 26, 2026