DW

Lakehouse Notes

Replication Method Replication Method - Replication method to use for extracting data from the database. STANDARD replication requires no setup on the DB side but will not be able to represent deletions incrementally. CDC uses the Binlog to detect inserts, updates, and deletes. This needs to be configured on the source database itself. S3 Support in Apache Hadoop Apache Hadoop ships with a connector to S3 called “S3A”, with the url prefix “s3a:“; its previous connectors “s3”, and “s3n” are deprecated and/or deleted from recent Hadoop versions.