Posts

Building a Data Lake on Amazon Simple Storage Service

Amazon Simple Storage Service (S3) is a cloud-based storage service that stores data in its native format. S3 is designed for 99.999999999% (11 nines) data durability, and data of any volume is stored in a secured environment. In Amazon S3, data is stored as objects, each consisting of a file and its metadata, and objects are kept in buckets. To store data, the file and its metadata are uploaded to a bucket as an object. After this step, permissions can be granted on the objects or metadata stored in the buckets. Many capabilities can be used when a data lake is built on Amazon S3. These include media data processing applications, Artificial Intelligence (AI), Machine Learning (ML), big data analytics, and high-performance computing (HPC). When all these are used in conjunction, businesses get access to critical data, business intelligence, and analytics from the S3 data lake and its unstructured data sets. There are several benefits of the S3 data lake. The first is different c
Recent posts

The Evolution of Technology of Oracle Change Data Capture

Oracle change data capture (CDC) was first launched with the 9i version as a built-in tool of the Oracle database. It recorded and monitored all changes made to the user tables in a database. These changes were then stored in change tables and used by ETL applications for later processing and transfer to other data warehouses and databases. The first release of Oracle change data capture relied on triggers placed in the source database. However, database administrators found this technology very invasive and did not favor it. Ultimately, Oracle changed the Oracle change data capture technology and released it with the 10g version under the name Oracle Streams. This release worked differently. Oracle change data capture used the redo logs of the source database along with the replication tool of Oracle Streams. This technology turned out to be a very successful and highly optimized method to identify and move change data to a target database without af
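To illustrate the trigger-based approach of the first release described above, here is a minimal sketch using Python's built-in `sqlite3` module rather than Oracle. The `accounts` table, the `accounts_changes` change table, and the trigger names are hypothetical; the point is only to show how triggers on a source table can record every insert, update, and delete in a change table.

```python
import sqlite3

# In-memory SQLite database standing in for the source database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);

-- Hypothetical change table: one row per captured change.
CREATE TABLE accounts_changes (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    op      TEXT,      -- 'I' = insert, 'U' = update, 'D' = delete
    id      INTEGER,
    balance INTEGER
);

CREATE TRIGGER accounts_ai AFTER INSERT ON accounts BEGIN
    INSERT INTO accounts_changes (op, id, balance)
    VALUES ('I', NEW.id, NEW.balance);
END;

CREATE TRIGGER accounts_au AFTER UPDATE ON accounts BEGIN
    INSERT INTO accounts_changes (op, id, balance)
    VALUES ('U', NEW.id, NEW.balance);
END;

CREATE TRIGGER accounts_ad AFTER DELETE ON accounts BEGIN
    INSERT INTO accounts_changes (op, id, balance)
    VALUES ('D', OLD.id, OLD.balance);
END;
""")

# Ordinary DML against the source table; the triggers do the capturing.
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM accounts WHERE id = 1")

changes = conn.execute(
    "SELECT op, id, balance FROM accounts_changes ORDER BY seq"
).fetchall()
print(changes)
```

Every statement against the source table now does extra work inside the same transaction, which hints at why a trigger-based design can feel invasive compared with reading changes from the redo logs.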

All That You Need to Know About SQL Server CDC

In the modern business environment, data has to be protected from breaches. Keeping this in mind, major database solution providers like Oracle and Microsoft launched various initiatives like triggers, complex queries, timestamps, and data audits. The first player was Microsoft when in 2005 it launched SQL Server Change Data Capture (CDC) with “after update”, “after delete”, and “after insert” features. A modified version, introduced in 2008 and still in use, can monitor and capture any changes made to the SQL Server database. The functioning of the SQL Server change data capture feature is not complex. All changes like insert, update, and delete made to a SQL Server table are captured by Change Data Capture, which then records the details of the modifications in a user-friendly relational format. Information about metadata and column structure necessary to apply changes to the target database is captured for the changed rows and stored in change tables. These tables re

Snowflake Data Lake – A Cloud-based Data Storage Solution

Data lakes are data storage repositories that allow massive volumes of data in its native form (structured, semi-structured, or unstructured) to be stored for processing at a later date. In the past, data storage solutions had various components like data warehouses, data marts, and more. But now, with cloud-based platforms like the Snowflake data lake, these are no longer required. Snowflake data lake is a high-performing cloud-based solution that provides unlimited storage and computing capabilities. Users have the option of scaling data storage usage up or down and pay only for the resources used. This is critical for businesses as there is no need to invest in additional hardware and software to meet extra storage requirements whenever there is a spike in demand. Another advantage of the Snowflake data lake is that it has optimized computing capabilities. Even when multiple users simultaneously execute several intricate queries, there is hardly any drop in speed or perf

Tasks performed by the SAP ETL Tool

ETL is the process of extracting, transforming, and loading data from multiple sources into a centralized data repository, with the ETL tool being able to extract data in its native format. SAP, on the other hand, is a software system that helps to process data and ensures that the ideal flow is maintained to optimize business efficiencies. The full package has ERP, database systems, application servers, and technology stacks. Multiple tasks can be performed by the SAP ETL tool. Not only does it integrate different systems and transform data formats to match each other, but it also helps to move data to and from the SAP ecosystem. The SAP ETL tool also verifies whether the value of a name has been specified. The most critical advantage here is that data can be extracted and transformed externally even outside the application. There are several reasons why most organizations in this modern data-driven environment prefer to use the SAP ETL tool. The first is that once the SAP ETL too

The Change Data Capture (CDC) Feature in Microsoft SQL Server

Organizations today face several issues in the areas of data security and safety and in ramping up systems for the preservation of historical data. Leading database platforms, Microsoft among them, took steps in this regard by launching data audits, timestamps, complex queries, and triggers. Microsoft led the innovation when, in 2005, it introduced SQL Server CDC with the “after update”, “after delete”, and “after insert” features. SQL Server CDC captures and records all activities like insert, update, or delete that are applied to a SQL Server table. Changes made are available in a user-friendly relational format, and the metadata and information required for posting changes to the target databases are captured for the modified rows. These are stored in change tables with the same column structure as the tracked source tables. SQL Server CDC also tracks and records changes in the mirrored tables with column structures that are the same as the source tables. This is exc
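The idea of posting captured changes to a target can be sketched in a few lines of Python. This is an illustrative sketch only, not how SQL Server itself applies changes: the `apply_changes` helper, the `op` field name, and the sample rows are hypothetical. The operation codes mirror the `__$operation` column of SQL Server change tables (1 = delete, 2 = insert, 3 = update before image, 4 = update after image).

```python
# Operation codes as used in the __$operation column of SQL Server change tables.
DELETE, INSERT, UPDATE_BEFORE, UPDATE_AFTER = 1, 2, 3, 4


def apply_changes(target, change_rows, key="id"):
    """Replay change rows, in order, against a target dict keyed by `key`.

    Hypothetical helper: `target` stands in for the target table.
    """
    for row in change_rows:
        op = row["op"]
        # Everything except the operation code mirrors the source columns.
        data = {k: v for k, v in row.items() if k != "op"}
        if op == DELETE:
            target.pop(data[key], None)
        elif op in (INSERT, UPDATE_AFTER):
            target[data[key]] = data
        # UPDATE_BEFORE rows carry the old values, so there is nothing to apply.
    return target


# Sample change rows (made up for illustration), in commit order.
change_rows = [
    {"op": INSERT, "id": 1, "name": "Ann"},
    {"op": INSERT, "id": 2, "name": "Bob"},
    {"op": UPDATE_BEFORE, "id": 1, "name": "Ann"},
    {"op": UPDATE_AFTER, "id": 1, "name": "Anne"},
    {"op": DELETE, "id": 2, "name": "Bob"},
]

target = apply_changes({}, change_rows)
print(target)
```

Because each change row carries the same columns as the source table plus a small amount of metadata, replaying the rows in order is enough to bring the target in line with the source.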

What is Core Data Service View in SAP

SAP CDS (SAP Core Data Service) is a data modeling infrastructure in which data models are defined and consumed on the database server rather than on the application server. Certain key needs are met by the powerful in-memory database of SAP HANA. These include fast real-time performance for optimized activities at the database level, quick retrieval of data, and reduced application execution time. Developers use SAP CDS to create underlying data models and application services for exposure to UI clients as CDS View SAP. CDS View SAP is a virtual data model of SAP. It allows direct access to the underlying tables of the HANA database. CDS View SAP was introduced by SAP with a new programming model that focused on pushing logic from the application server to the client-side and database and is known as ‘Code-to-Data’ or ‘Code Pushdown’. CDS View SAP takes the logic from the ABAP application and executes it on the database server instead of the application server. ABAP CDS Views are bas