⚡ Quick Start Guide

Get started with CSV Generator Pro in minutes

Version 2.9.2

🆕 What's New in v2.9.2

Auto-Add Unknown Fields Feature

A new checkbox in the Import Data section automatically adds custom fields found in imported files. Perfect for working with company-specific or legacy data formats!

  • Enabled: Automatically adds any fields not in the standard list
  • Disabled: Rejects files with incompatible fields (previous behavior)
  • Default: Enabled (checked) for maximum flexibility
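
Under the hood this amounts to a set difference between the imported header row and the standard field list. A minimal sketch of the idea in Python (field names are illustrative, not the tool's actual list or implementation):

```python
# Minimal sketch of the auto-add idea. Field names are illustrative,
# not the tool's actual standard list.
standard_fields = {"id", "firstName", "lastName", "email", "phone"}

def resolve_fields(imported_headers, auto_add=True):
    unknown = [h for h in imported_headers if h not in standard_fields]
    if unknown and not auto_add:
        # Previous behavior: reject files with incompatible fields.
        raise ValueError(f"Incompatible fields: {unknown}")
    # Auto-add behavior: unknown fields are kept as custom fields.
    return list(imported_headers)

print(resolve_fields(["id", "email", "customer_tier"]))  # custom field kept
# resolve_fields(["id", "customer_tier"], auto_add=False)  # would raise
```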

🏃 30-Second Quick Start

⏱️ 30 seconds

Generate Your First Dataset

  1. Open csv-generator-pro.html in your browser
  2. Click "Select Common" button (selects frequently-used fields)
  3. Click "Generate Data" button
  4. Click "Download CSV" to save your file

✅ Done! You've just created a realistic dataset with 1,000 rows.
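
Want to double-check the download outside the browser? A quick sanity check with pandas (the filename is a placeholder; use whatever your browser saved):

```python
import pandas as pd

# Filename is a placeholder; point this at your downloaded file.
df = pd.read_csv("generated_data.csv")
print(len(df))                 # expect 1,000 rows
print(df.columns.tolist())     # the fields you selected
print(df.head())
```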

📋 Common Tasks

⏱️ 1 minute

Task 1: Generate Customer Data

  1. Select these fields: id, firstName, lastName, email, phone, city, country, status
  2. Set rows to 500
  3. Click "Generate Data"
  4. Click "Download CSV"

⏱️ 1 minute

Task 2: Use a Built-in Preset

  1. Click the configuration dropdown
  2. Select "Sales Transaction Log"
  3. Click "Load Config"
  4. Click "Generate Data"
  5. Click "Download CSV"

⏱️ 2 minutes

Task 3: Import and Convert Files NEW

  1. Click "Choose File" in Import Data section
  2. Select your CSV, NDJSON, JSON, or Parquet file
  3. If the file has custom fields:
    • ✅ Keep "Auto-add unknown fields" checked (default)
    • Custom fields will be automatically added
  4. Data loads automatically with fields selected
  5. Choose new output format (CSV, NDJSON, or Parquet)
  6. Click "Download" to save in new format

Example: Import a Parquet file with custom fields like customer_id and order_total, then export as CSV for Excel analysis.
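
The same conversion is easy to script outside the tool if you ever need it in a pipeline; a minimal pandas sketch (file names are placeholders):

```python
import pandas as pd

# File names are placeholders for your own import/export paths.
df = pd.read_parquet("orders.parquet")  # requires pyarrow or fastparquet
print(df[["customer_id", "order_total"]].head())  # custom fields survive import
df.to_csv("orders.csv", index=False)    # Excel-friendly output
```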

⏱️ 3 minutes

Task 4: Create Consistent IDs for SQL Joins

  1. Check "Enable Deterministic IDs"
  2. Select method: Standard (uses firstName + lastName + email)
  3. Select fields: id, firstName, lastName, email, date
  4. Generate 1,000 rows
  5. Save as customers.csv
  6. Change fields to: id, firstName, lastName, email, product, price
  7. Generate 5,000 rows (using the same Standard method)
  8. Save as orders.csv

✅ Now you can JOIN these tables on the id field because the same person gets the same ID in both files!
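
The tool's exact hashing scheme isn't spelled out here, but the principle is simple: derive the ID from the identity fields, so identical inputs always produce identical IDs. A sketch in Python (sha256 is an illustrative choice, not necessarily what the tool uses):

```python
import hashlib
import pandas as pd

def det_id(first, last, email):
    # Any stable hash of the identity fields works; sha256 is an
    # illustrative choice, not necessarily the tool's algorithm.
    key = f"{first}|{last}|{email}".lower()
    return hashlib.sha256(key.encode()).hexdigest()[:12]

# Same person -> same ID, no matter which file it appears in:
assert det_id("Ada", "Lovelace", "ada@example.com") == \
       det_id("Ada", "Lovelace", "ada@example.com")

customers = pd.read_csv("customers.csv")
orders = pd.read_csv("orders.csv")
print(orders.merge(customers, on="id").head())  # the JOIN lines up
```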

⏱️ 5 minutes

Task 5: Upload to AWS S3 with Partitioning

  1. Fill in S3 credentials (bucket, region, access key, secret key)
  2. Set S3 Directory: sales/category={{category}}/year=yyyy/month=mm/
  3. Select fields including category and date
  4. Enable "Random Dates" with range 2024-01-01 to 2024-12-31
  5. Check "Split by Fields" and "Split by Date"
  6. Click "Generate Data"
  7. Click "Quick Upload to S3"

✅ Files will be organized into partitioned directories like:
sales/category=Electronics/year=2024/month=11/
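
Under the hood this is plain template substitution: {{field}} placeholders take values from each record, and yyyy/mm/dd take the record's date parts. A minimal sketch of the expansion (the tool's actual logic may differ):

```python
from datetime import date

def expand_path(template, record, d):
    # Sketch only; the tool's actual substitution logic may differ.
    # yyyy/mm/dd -> zero-padded date parts, then {{field}} -> record value.
    path = template.replace("yyyy", f"{d.year:04d}") \
                   .replace("mm", f"{d.month:02d}") \
                   .replace("dd", f"{d.day:02d}")
    for field, value in record.items():
        path = path.replace("{{" + field + "}}", str(value))
    return path

print(expand_path("sales/category={{category}}/year=yyyy/month=mm/",
                  {"category": "Electronics"}, date(2024, 11, 5)))
# -> sales/category=Electronics/year=2024/month=11/
```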

🔑 Key Features Quick Reference

Import & Export

  • Import formats: CSV, NDJSON, JSON, Parquet
  • Export formats: CSV, NDJSON, Parquet
  • Auto-add fields: Automatically handle custom field names (v2.9.2)
  • Format conversion: Import any format, export to any format

Field Selection

  • 41+ field types covering personal, business, product, and technical data
  • Quick buttons: Select All, Deselect All, Select Common
  • Custom fields: Import files with any field names using auto-add feature

Data Generation

  • Row counts: 1 to 1,000,000 records
  • Date options: Fixed date or random date ranges
  • Deterministic IDs: Create consistent IDs for multi-table relationships
  • Deduplication: Automatic handling of duplicate records

AWS S3 Integration

  • Direct upload: No additional tools needed
  • Dynamic paths: Use {{field}} and yyyy/mm/dd placeholders
  • File splitting: Automatic split by date or field values
  • Hive partitioning: Create data lake structures like year=2024/month=11/
  • Test CORS: Validate bucket configuration before uploading
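
CORS is the most common blocker for browser-based S3 uploads: the bucket must explicitly allow cross-origin PUT requests. A sketch of a workable CORS rule applied with boto3 (the bucket name is a placeholder; tighten AllowedOrigins to your own origin in practice):

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_cors(
    Bucket="my-test-bucket",  # placeholder bucket name
    CORSConfiguration={
        "CORSRules": [{
            "AllowedOrigins": ["*"],  # tighten to your origin in practice
            "AllowedMethods": ["GET", "PUT", "POST"],
            "AllowedHeaders": ["*"],
            "ExposeHeaders": ["ETag"],
            "MaxAgeSeconds": 3000,
        }]
    },
)
```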

Configuration Management

  • 12 built-in presets: Customer lists, sales logs, product inventory, etc.
  • Save custom configs: Store your field selections and settings
  • Import/Export: Share configurations with team members
  • Batch processing: Automatically process multiple configs

🔄 Common Workflows

Workflow 1: Data Warehouse Testing

  1. Create customers table - 50,000 rows with deterministic IDs
  2. Create orders table - 200,000 rows with same ID method
  3. Create products table - 5,000 rows
  4. Upload all to S3 with Hive-style partitioning
  5. Test Athena queries with realistic data volumes
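
A test query against the joined, partitioned tables might look like the following; a sketch using boto3 (the database, table names, and output location are placeholders):

```python
import boto3

athena = boto3.client("athena")
resp = athena.start_query_execution(
    QueryString="""
        SELECT c.id, c.email, COUNT(*) AS orders
        FROM orders o JOIN customers c ON o.id = c.id
        WHERE o.year = '2024' AND o.month = '11'
        GROUP BY c.id, c.email
    """,
    QueryExecutionContext={"Database": "testdb"},  # placeholder database
    ResultConfiguration={
        "OutputLocation": "s3://my-test-bucket/athena-results/"  # placeholder
    },
)
print(resp["QueryExecutionId"])
```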

Workflow 2: Format Conversion

  1. Import your existing CSV file
  2. Custom fields? The auto-add feature handles them automatically
  3. Select Parquet as output format
  4. Download - Parquet output is often 10-100x smaller than the source CSV (see the check below)
  5. Upload to S3 for use with Athena or Redshift
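
Actual savings depend on how compressible your data is, so it's worth measuring on your own file; a minimal sketch (file names are placeholders):

```python
import os
import pandas as pd

df = pd.read_csv("customers.csv")                # placeholder input file
df.to_parquet("customers.parquet", index=False)  # requires pyarrow
csv_kb = os.path.getsize("customers.csv") / 1024
pq_kb = os.path.getsize("customers.parquet") / 1024
print(f"CSV: {csv_kb:.0f} KB, Parquet: {pq_kb:.0f} KB "
      f"({csv_kb / pq_kb:.1f}x smaller)")
```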

Workflow 3: Batch Data Lake Population

  1. Create configs for each table (customers, orders, products, etc.)
  2. Set S3 paths with partitioning for each config
  3. Enable split settings for large datasets
  4. Optionally enable pause mode to review each dataset before upload
  5. Start batch upload - all tables are generated and uploaded automatically

Workflow 4: Legacy Data Modernization

  1. Import old CSV file with custom field names
  2. The auto-add feature preserves all original columns
  3. Add standard fields if needed for enrichment
  4. Export as Parquet for modern data lake
  5. Upload to S3 with proper partitioning

💡 Pro Tips

  • Start small: Generate 100 rows first to verify your configuration
  • Use presets: The built-in configurations are optimized and ready to use
  • Enable console logging: See exactly what's happening during generation and upload
  • Test CORS first: Before uploading to S3, use the "Test CORS" button
  • Export configs: Back up your configurations regularly
  • Parquet for analytics: Use Parquet format for data warehouse uploads
  • Random dates for splitting: Always use random dates when splitting by date
  • Deterministic IDs: Use the same method across related tables for JOIN capability
  • Batch with pause: Enable pause mode in batch processing to review before upload
  • Auto-add for imports: Keep the checkbox enabled to handle any file format

❓ Getting Help

Need more details?

  • Complete Help Documentation - Comprehensive guide to all features
  • Documentation Hub - All documentation in one place
  • Console Logging: Enable it to see detailed information about operations
  • Built-in Examples: The 12 presets demonstrate best practices