📊 CSV Generator Pro - Help Guide
🎯 Field Types
Over 30 field types are available, including ID, Name, Email, Phone, Address, City, State, Country, ZIP, Company, Job Title, Department, Status, Category, Date, Revenue, Price, Quantity, SKU, URL, IP Address, Username, Boolean, Rating, Priority, and more.
🎲 Generation Controls
Rows: Number of data rows to generate (1-500,000)
Format: Output format - CSV or NDJSON (the sketch below shows the difference)
Filename: Name for the output file (used for downloads and S3 uploads)
Generate Data: Create the data based on selected fields and settings
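To make the format choice concrete, here is a minimal sketch of how one record set serializes to each format (the record shape and helper names are illustrative, not the generator's internal code):

```typescript
// Hypothetical record type; the real columns depend on your field selection.
type Row = Record<string, string | number | boolean>;

const rows: Row[] = [
  { id: 1, name: "Ada Lovelace", email: "ada@example.com" },
  { id: 2, name: "Alan Turing", email: "alan@example.com" },
];

// CSV: one header line, then one comma-separated line per record.
// (A real writer would also quote/escape values containing commas or newlines.)
function toCsv(data: Row[]): string {
  const headers = Object.keys(data[0]);
  const body = data.map((r) => headers.map((h) => String(r[h])).join(","));
  return [headers.join(","), ...body].join("\n");
}

// NDJSON: no header; each record is a standalone JSON object on its own line.
function toNdjson(data: Row[]): string {
  return data.map((r) => JSON.stringify(r)).join("\n");
}

console.log(toCsv(rows));
console.log(toNdjson(rows));
```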
📅 Date Controls
Fixed Date: Use the same date for all records
Random Dates: Generate random dates within specified range
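A random-date generator of this kind can be sketched in a few lines (the function name is hypothetical):

```typescript
// Pick a uniformly random instant between start (inclusive) and end (exclusive).
function randomDate(start: Date, end: Date): Date {
  const t = start.getTime() + Math.random() * (end.getTime() - start.getTime());
  return new Date(t);
}

// Fixed-date mode would instead stamp every record with one chosen date.
console.log(randomDate(new Date(2025, 0, 1), new Date(2025, 11, 31)));
```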
🆔 ID Generation Types
Sequential: Traditional incrementing IDs (1, 2, 3...)
Deterministic: Hash-based IDs that remain consistent across datasets for the same person/entity
Why Deterministic? Enables SQL JOINs across multiple datasets (e.g., same person ID in both Employee and Customer tables)
Methods:
- Auto-detect: Automatically picks the best available fields (email > phone > date)
- Basic: Uses firstName + lastName only
- Standard: Uses firstName + lastName + email (recommended)
- Enhanced: Uses firstName + lastName + email + date
ID Range: Controls the maximum ID value (100K, 1M, or 10M)
🧮 ID Calculator: Calculate what someone's ID would be without accessing the dataset! Enter their details and get their deterministic ID - perfect for lookups when you don't have the original data.
⚠️ Case Insensitive: IDs are case-insensitive - "John Smith" and "john smith" produce the SAME ID. All inputs are normalized to lowercase before hashing.
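As an illustration of the general technique, here is a minimal sketch of a deterministic ID using the Standard method. FNV-1a is a stand-in hash; the app's actual hash function is not documented here:

```typescript
// FNV-1a 32-bit hash (stand-in; the app's real hash may differ).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Standard method: firstName + lastName + email, normalized to lowercase
// so that "John Smith" and "john smith" hash to the SAME ID.
function deterministicId(
  firstName: string,
  lastName: string,
  email: string,
  maxId = 1_000_000 // the configured ID range: 100K, 1M, or 10M
): number {
  const key = `${firstName}|${lastName}|${email}`.toLowerCase();
  return (fnv1a(key) % maxId) + 1; // keep IDs in 1..maxId
}

// The same person always gets the same ID, which is what makes cross-dataset JOINs work.
console.log(deterministicId("John", "Smith", "john.smith@example.com"));
console.log(deterministicId("JOHN", "SMITH", "JOHN.SMITH@EXAMPLE.COM")); // identical
```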
📋 Data Preview & Manipulation
Pagination: Navigate through data with First/Previous/Next/Last buttons. Select rows per page (5, 10, 25, 50, 100, 250, 500).
Multi-Column Sort: Click a column header to sort. Shift+Click to add secondary/tertiary sorts. Visual indicators show sort order (▲▼) and priority (①②③). A comparator sketch follows this list.
Search & Filter: Type in search box to filter all columns. Filter count shown when active.
Inline Editing: Click any cell to edit its value (changes are not automatically saved to the file).
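The multi-column sort above behaves like a comparator that walks the sort keys in priority order; a minimal sketch (names are illustrative):

```typescript
// One entry per clicked column, primary sort first.
type SortKey = { column: string; direction: "asc" | "desc" };

function multiSort<T extends Record<string, unknown>>(rows: T[], keys: SortKey[]): T[] {
  return [...rows].sort((a, b) => {
    for (const { column, direction } of keys) {
      const cmp = String(a[column]).localeCompare(String(b[column]), undefined, {
        numeric: true, // so "10" sorts after "9"
      });
      if (cmp !== 0) return direction === "asc" ? cmp : -cmp;
      // Equal on this key: fall through to the next (secondary, tertiary, ...) key.
    }
    return 0;
  });
}

// Sorting by state, then Shift+Clicking "city", corresponds to:
// multiSort(rows, [{ column: "state", direction: "asc" }, { column: "city", direction: "asc" }]);
```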
⚙️ Configuration Management
Save: Save current field selection and settings as a preset.
What's Saved: Field selections, row count, output format, date controls, S3 directory/filename, split settings (by date/fields), append timestamp, and ID generation settings.
Load Preset: Load saved configuration from dropdown.
New Preset: Start fresh configuration with all fields cleared.
Delete: Remove selected preset.
Clear All: Delete ALL saved presets (requires confirmation).
Import/Export: Share configurations via JSON file.
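For illustration, an exported preset might look like the following shape (hypothetical; the app's real JSON schema may use different keys):

```typescript
// Hypothetical preset shape covering the settings listed under "What's Saved".
interface PresetConfig {
  name: string;
  fields: string[];            // selected field types
  rowCount: number;
  outputFormat: "csv" | "ndjson";
  filename?: string;
  s3Directory?: string;        // e.g. "data/year=yyyy/month=mm/"
  splitByDate?: boolean;
  splitByFields?: boolean;
  appendTimestamp?: boolean;
  idGeneration?: {
    type: "sequential" | "deterministic";
    method?: "auto" | "basic" | "standard" | "enhanced";
    range?: number;            // 100_000 | 1_000_000 | 10_000_000
  };
}

const example: PresetConfig = {
  name: "customers-monthly",
  fields: ["ID", "Name", "Email", "Country", "Date"],
  rowCount: 10_000,
  outputFormat: "csv",
  filename: "customers.csv",
  s3Directory: "customers/year=yyyy/month=mm/",
  splitByDate: true,
};
```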
☁️ AWS S3 Upload
Basic Upload: Provide bucket, region, credentials, and directory path.
Directory Path: Specify only the directory path (filename is set in Generation Controls).
Date Placeholders: Use yyyy, mm, dd for date-based paths.
- Split by Date UNCHECKED: Date placeholders use today's date
- Split by Date CHECKED: Date placeholders use dates from records (creates multiple files)
Field Placeholders: Use {{fieldName}} for field-based grouping.
Examples:
data/year=yyyy/month=mm/ → data/year=2025/month=01/ (today's date)
users/{{country}}/ → users/USA/, users/Canada/
events/{{status}}/yyyy-mm/ → events/Active/2025-01/
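Resolving such a template for one record can be sketched as follows (a naive string replacement for illustration; a field value that happens to contain yyyy/mm/dd would be mangled here, so the app presumably tokenizes more carefully):

```typescript
// Resolve a directory template: {{field}} placeholders first, then date tokens.
function resolvePath(template: string, record: Record<string, string>, date: Date): string {
  const yyyy = String(date.getFullYear());
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return template
    .replace(/\{\{(\w+)\}\}/g, (_, field) => record[field] ?? "unknown")
    .replace(/yyyy/g, yyyy)
    .replace(/mm/g, mm)
    .replace(/dd/g, dd);
}

const record = { country: "USA", status: "Active" };
console.log(resolvePath("events/{{status}}/yyyy-mm/", record, new Date(2025, 0, 15)));
// -> "events/Active/2025-01/"
```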
Split Options:
- Split by Date: Creates separate files for each unique date in records
- Split by Fields: Groups by field values in path using {{field}} placeholders (a grouping sketch follows this list)
- Both: Maximum partitioning flexibility
- Step-by-step: Manual control over each upload for debugging
- Console Logging: Enable detailed logging to browser console
- 💾 Note: Split settings are saved with each configuration preset and restored when loaded
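When splitting is enabled, records are effectively grouped by their resolved path, with each group becoming one uploaded file. Building on the resolvePath sketch above (names still illustrative):

```typescript
// Group records by resolved directory; each map key becomes one S3 object.
function groupForUpload(
  records: Record<string, string>[],
  template: string,
  dateOf: (r: Record<string, string>) => Date
): Map<string, Record<string, string>[]> {
  const groups = new Map<string, Record<string, string>[]>();
  for (const r of records) {
    const key = resolvePath(template, r, dateOf(r)); // from the sketch above
    const bucket = groups.get(key) ?? [];
    bucket.push(r);
    groups.set(key, bucket);
  }
  return groups; // e.g. "events/Active/2025-01/" -> [records...]
}
```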
🔧 Lambda Partitioning (Advanced)
Purpose: Generate a single bulk Parquet file with embedded metadata that an AWS Lambda function can read and automatically partition into multiple files.
Requirements:
- Output format must be Parquet
- S3 directory path should include partition placeholders
- Lambda function must be set up to read partition metadata
How it Works:
- Check "Enable Lambda Partitioning" in S3 Upload section
- Generator creates ONE large Parquet file with ALL records
- Partition configuration is embedded as metadata in the file
- File is uploaded to S3 root or specified directory
- Lambda function reads metadata and partitions the file
Partition Template Examples:
data/year=yyyy/category={{category}}/month=mm/
events/{{status}}/yyyy-mm-dd/
sales/{{region}}/year=yyyy/
Embedded Metadata Includes:
- Partition template (converted to Lambda-friendly format)
- Partition columns (extracted from placeholders)
- Creation date and record count
- Version information
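A plausible shape for that embedded metadata, for illustration only (the keys the Lambda function actually expects may differ):

```typescript
// Hypothetical metadata record embedded in the bulk Parquet file.
interface PartitionMetadata {
  partitionTemplate: string;  // the template, converted to a Lambda-friendly form
  partitionColumns: string[]; // extracted from placeholders, e.g. ["year", "category", "month"]
  createdAt: string;          // ISO-8601 creation timestamp
  recordCount: number;
  version: string;
}
```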
Benefits:
- Single upload instead of thousands of small files
- Faster upload process
- Lambda handles complex partitioning logic
- Better for large datasets (100k+ records)
⚠️ Note: When Lambda Partitioning is enabled, Split by Date and Split by Fields are automatically disabled as the Lambda function handles all partitioning.
🔄 Batch Processing
Batch Upload All Configs: Automatically process and upload all saved configurations in sequence.
Requirements: Each config must have both rowCount and outputFormat defined.
Pause for Confirmation: When checked, the batch process will pause before each upload, allowing you to:
- Review the generated data in the preview table
- Verify the configuration is correct
- Click Continue to upload or Stop to cancel
Process Flow:
- Loads each configuration from the dropdown in order
- Restores all saved settings including split by date/fields
- Generates data according to config settings
- If pause mode enabled, waits for Continue confirmation
- Uploads to S3 (respects each config's individual split settings)
- Moves to next configuration
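In outline, the batch loop looks something like this (function names are illustrative stand-ins for the app's internal steps):

```typescript
type BatchConfig = { name: string; rowCount?: number; outputFormat?: string };

// Illustrative stand-ins for the app's internal steps.
const applyConfig = (c: BatchConfig) => console.log(`Loaded config: ${c.name}`);
const generateData = (c: BatchConfig) => `${c.rowCount} rows as ${c.outputFormat}`;
const waitForContinue = async () => {}; // in the app, resolves when the user clicks Continue
const uploadToS3 = async (data: string, c: BatchConfig) =>
  console.log(`Uploading ${data} for ${c.name}`);

async function runBatch(configs: BatchConfig[], pauseForConfirmation: boolean) {
  let succeeded = 0, failed = 0, skipped = 0;
  for (const config of configs) {
    // Requirement: skip configs missing rowCount or outputFormat.
    if (!config.rowCount || !config.outputFormat) { skipped++; continue; }
    try {
      applyConfig(config);                               // restore saved settings, incl. split flags
      const data = generateData(config);
      if (pauseForConfirmation) await waitForContinue(); // pause for review before upload
      await uploadToS3(data, config);                    // respects this config's split settings
      succeeded++;
    } catch {
      failed++;
    }
  }
  return { succeeded, failed, skipped };                 // the counts shown in progress tracking
}
```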
Progress Tracking: Shows current config, success/failed/skipped counts
Stop Batch: Cancel batch processing at any time
Console Logging: Automatically enabled during batch processing for full visibility
⚙️ Per-Config Settings: Each config remembers its own split settings - one config can split by date while another doesn't
🧠 Tips & Tricks
- Use "Select Common" for quick setup of typical fields
- Enable Console Logging to debug S3 uploads
- Step-by-step mode lets you control each S3 upload manually
- Export configs to share with your team
- Use field placeholders to organize S3 data efficiently