Data
Why Data Warehouse and Business Intelligence Testing Are Crucial for Success
In today’s data-driven landscape, testing your data warehouse and Business Intelligence (BI) systems early and often is essential. Neglecting proper testing can lead to inaccurate results and sluggish system performance, which could force you to restart your BI project—wasting valuable time, resources, and money, while also risking poor business decisions and lost opportunities. Expert Perspectives… Read More
Comprehensive Guide to Exposure Data Audit for Personally Identifiable Information in SQL Server
As a Business Intelligence Architect or Developer, performing an Exposure Data Audit to identify Personally Identifiable Information (PII) within your SQL Server 2016 environment is essential. This process helps uncover potential data security risks and supports the implementation of robust, enterprise-grade security policies. Microsoft SQL Server 2016 represents a significant leap forward in database security… Read More
Why Choose File Storage in Data Warehouse Architectures?
In this article, we’ll explore the strategic role of file storage within data warehouse design patterns, particularly in cloud-based environments. Referencing Microsoft’s published data warehouse architecture, we’ll focus on the common practice of extracting data from source systems and storing it as files—often in Azure Blob Storage or Azure Data Lake—before loading it into the… Read More
How to Filter Records for the Current User in Power Apps
One of the most common questions asked during Power Apps training sessions is: Can I filter gallery records to show only those created by or assigned to the logged-in user? The good news: absolutely, yes! Filtering records by user is not only possible, it’s also a best practice for creating personalized and secure… Read More
Enhance PySpark Development with the AI Assistant in Databricks
In today’s data-driven world, efficient coding and quick debugging are crucial. Databricks’ AI Assistant offers a groundbreaking way to simplify PySpark development by helping you write, debug, and optimize code directly within the platform. In this tutorial, Mitchell Pearson walks through practical use cases of this intelligent tool, showing how it enhances productivity for data… Read More
How to Build Power Apps for Disconnected and Offline Use
Have you ever needed to use an app without internet or Wi-Fi but still wanted to save your data to a database? In this guide, I’ll explain how to design a Power Apps application that works seamlessly offline or in disconnected environments. This app stores data locally on your device and automatically syncs it to… Read More
Understanding Static Data Masking: A Powerful Data Protection Feature
Today, I want to introduce you to an exciting and relatively new feature called Static Data Masking. This capability is available not only for Azure SQL Database but also for on-premises SQL Server environments. After testing it myself, I’m eager to share insights on how this feature can help you protect sensitive data during development… Read More
Understanding Data Governance: The Essential Framework
Data security remains a top priority for organizations worldwide, and effective data governance policies are key to achieving this. In this first installment of our two-part series on data governance, we’ll explore the foundational concepts you need to know to build a strong data governance strategy. Understanding the Three Fundamental Pillars of Data Governance Data… Read More
The True Cost of Poor Data Quality – Infographic Insight
Bad data has become a widespread issue impacting businesses globally. LegiTest is a cutting-edge solution designed to combat this problem by improving data accuracy and reliability. Below are eye-opening statistics that reveal how poor data quality affects organizations. The Expanding Challenge of Handling Vast Data Volumes in Modern Enterprises In today’s hyperconnected… Read More
The Core of Data Engineering — Foundations, Functions, and the Future
In an era where data has become the new currency, one of the most essential figures in any organization is the Data Engineer. They are the architects of data infrastructure, the builders of systems that turn raw inputs into actionable intelligence. Without them, the entire foundation of data-driven decision-making collapses. Every product recommendation, predictive insight,… Read More
The Certified Data Engineer Associate Role and Its Organizational Value
In a world where businesses generate and depend on massive volumes of information—from customer interactions and system logs to sensor readings and transactional data—the role of the data engineer has become mission‑critical. Among the credentials available to aspiring data professionals, the Certified Data Engineer Associate validates a range of technical and design skills essential for… Read More
How to Add Custom Libraries in Databricks
In this week’s Databricks mini-series, we’re focusing on how to integrate custom code libraries into Databricks environments. Databricks provides many pre-installed libraries within its runtime for Python, R, Java, and Scala, which you can find documented in the System Environment section of the release notes. However, it’s common for users to require additional custom libraries… Read More
Why Trimming Data is Crucial Before Removing Duplicates or Merging in Power Query Editor
In my recent blog and video tutorial, I demonstrated how to remove duplicate records in Power BI while retaining the most recent entry—assuming your data includes a date column. This scenario came up frequently during training sessions. You can watch the video below for detailed steps. Understanding the Challenge: When Remove Duplicates in Power BI… Read More
Understanding and Managing Slowly Changing Dimensions in Data Modeling
Data modeling remains a foundational concept in analytics, especially in today’s big data era. It focuses on identifying the necessary data and organizing it efficiently. One critical aspect of data modeling is managing Slowly Changing Dimensions (SCDs), which handle changes in dimension data over time. In the realm of data warehousing and business intelligence, managing… Read More
Understanding Cosmos DB: A Versatile Multi-Model Database Service
In this article, we’ll explore the multi-model capabilities of Azure Cosmos DB and what this means for managing your data effectively. A multi-model database enables you to store and work with data in various formats, tailored to your application’s needs. Cosmos DB currently supports four distinct data models, each accessible through dedicated APIs that allow… Read More