• Product
  • Pricing
  • Docs
  • Using PostHog
  • Community
  • Company
  • Login
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
          • Horizontal scaling (Sharding & replication)
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Managing hosting costs
    • EU-only hosting
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Libraries
    • Badge
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Browser Extensions
      • Elixir
      • Go
      • Java
      • Node.js
      • PHP
      • Python
      • Ruby
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • Between Cloud and self-hosted
    • Overview
    • Tutorial
    • Troubleshooting
    • TypeScript types
    • Developer reference
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • Data model
    • Ingestion pipeline
    • ClickHouse
    • Querying data
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Data deletion
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
    • Overview
      • Amazon Kinesis Import
      • BitBucket Release Tracker
      • Braze Import
      • Event Replicator
      • GitHub Release Tracker
      • GitHub Star Sync
      • GitLab Release Tracker
      • Heartbeat
      • Ingestion Alert
      • Email Scoring
      • n8n Connector
      • Orbit Connector
      • Redshift Import
      • Segment Connector
      • Shopify Connector
      • Twitter Followers Tracker
      • Zendesk Connector
      • Airbyte Exporter
      • Amazon S3 Export
      • BigQuery Export
      • Customer.io Connector
      • Databricks Export
      • Engage Connector
      • GCP Pub/Sub Connector
      • Google Cloud Storage Export
      • Hubspot Connector
      • Intercom Connector
      • Migrator 3000
      • PagerDuty Connector
      • PostgreSQL Export
      • Redshift Export
      • RudderStack Export
      • Salesforce Connector
      • Sendgrid Connector
      • Sentry Connector
      • Snowflake Export
      • Twilio Connector
      • Variance Connector
      • Zapier Connector
      • Downsampler
      • Event Sequence Timer
      • First Time Event Tracker
      • Property Filter
      • Property Flattener
      • Schema Enforcer
      • Taxonomy Standardizer
      • Unduplicator
      • Automatic Cohort Creator
      • Currency Normalizer
      • GeoIP Enricher
      • Timestamp Parser
      • URL Normalizer
      • User Agent Populator
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
          • Horizontal scaling (Sharding & replication)
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Managing hosting costs
    • EU-only hosting
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Libraries
    • Badge
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Browser Extensions
      • Elixir
      • Go
      • Java
      • Node.js
      • PHP
      • Python
      • Ruby
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • Between Cloud and self-hosted
    • Overview
    • Tutorial
    • Troubleshooting
    • TypeScript types
    • Developer reference
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • Data model
    • Ingestion pipeline
    • ClickHouse
    • Querying data
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Data deletion
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
    • Overview
      • Amazon Kinesis Import
      • BitBucket Release Tracker
      • Braze Import
      • Event Replicator
      • GitHub Release Tracker
      • GitHub Star Sync
      • GitLab Release Tracker
      • Heartbeat
      • Ingestion Alert
      • Email Scoring
      • n8n Connector
      • Orbit Connector
      • Redshift Import
      • Segment Connector
      • Shopify Connector
      • Twitter Followers Tracker
      • Zendesk Connector
      • Airbyte Exporter
      • Amazon S3 Export
      • BigQuery Export
      • Customer.io Connector
      • Databricks Export
      • Engage Connector
      • GCP Pub/Sub Connector
      • Google Cloud Storage Export
      • Hubspot Connector
      • Intercom Connector
      • Migrator 3000
      • PagerDuty Connector
      • PostgreSQL Export
      • Redshift Export
      • RudderStack Export
      • Salesforce Connector
      • Sendgrid Connector
      • Sentry Connector
      • Snowflake Export
      • Twilio Connector
      • Variance Connector
      • Zapier Connector
      • Downsampler
      • Event Sequence Timer
      • First Time Event Tracker
      • Property Filter
      • Property Flattener
      • Schema Enforcer
      • Taxonomy Standardizer
      • Unduplicator
      • Automatic Cohort Creator
      • Currency Normalizer
      • GeoIP Enricher
      • Timestamp Parser
      • URL Normalizer
      • User Agent Populator
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Docs
  • Self-host
  • Runbook
  • Async migrations
  • 0003_fill_person_distinct_id2

Migration guide - 0002_fill_person_distinct_id2

Last updated: Sep 07, 2022

On this page

  • FAQ
  • Is it dangerous for this migration to be in an errored state?

0002_fill_distinct_id2 is an async migration added to to migrate the data from the old person_distinct_id table to the new person_distinct_id2 table.

This is needed for faster person_distinct_id queries as the old schema worked off of (distinct_id, person_id) pairs, making it expensive for our analytics queries, which need to map from distinct_id to the latest person_id.

The new schema works off of distinct_id columns, leveraging ReplacingMergeTrees with a version column we store in postgres.

We migrate teams one-by-one to avoid running out of memory.

The migration strategy:

1. Write any new updates to both tables
2. Insert all non-deleted (`team_id`, `distinct_id`, `person_id`) rows from `person_distinct_id` into `person_distinct_id2` (this migration)
3. Once migration has run, we only read/write from/to pdi2.

FAQ

Is it dangerous for this migration to be in an errored state?

No, the migration copies data to the new table, but that new table is not used until the migration has successfully completed.

Questions?

Was this page useful?

Next article

ClickHouse

ClickHouse® is an open-source, high performance columnar OLAP database management system for real-time analytics using SQL. We use it to store information like: event person person distinct id / session and to power all our analytics queries. This is a guide for how to operate ClickHouse with respect to our stack. Metrics As with any database it is important to keep an eye on metrics to make sure everything is in ship shape. Most of these metrics shouldn't be a surprise. The metrics you should…

Read next article

Share

Jump to:

  • FAQ
  • Is it dangerous for this migration to be in an errored state?
  • Questions?
  • Edit this page
  • Raise an issue
  • Toggle content width
  • Toggle dark mode
  • About
  • Blog
  • Newsletter
  • Careers
  • Support
  • Contact sales

Product OS suite

Product overview

Analytics
  • Funnels
  • Trends
  • Paths

Pricing

Features
  • Session recording
  • Feature flags
  • Experimentation
  • Heatmaps

Customers

Platform
  • Correlation analysis
  • Collaboration
  • Apps

Community

Discussion
  • Questions?
  • Slack
  • Issues
  • Contact sales
Get involved
  • Roadmap
  • Contributors
  • Merch
  • PostHog FM
  • Marketplace

Docs

Getting started
  • PostHog Cloud
  • Self-hosted
  • Compare options
  • Tutorials
  • PostHog on GitHub
Install & integrate
  • Installation
  • Docs
  • API
  • Apps
User guides
  • Cohorts
  • Funnels
  • Sessions
  • Data
  • Events

Company

About
  • Our story
  • Team
  • Handbook
  • Investors
  • Careers
Resources
  • FAQ
  • Ask a question
  • Blog
  • Press
  • Merch
  • YouTube
© 2022 PostHog, Inc.
  • Code of conduct
  • Privacy
  • Terms