Posts

Microsoft Purview - Insider Risk Management

With the emergence and wide acceptance of digitalization, and as the digital landscape continues to grow, the risk landscape for organizations has changed significantly. In earlier days, the insider risk management team was part of the security team, making sure that end-user training on data protection was in place and that corporate assets were secure. With digitalization and the industry's wholesale shift to the cloud, data volumes keep growing exponentially. Organizations are demanding more and more applications, and hence more monitoring and tighter controls are required. Personally, I believe that since Covid, the work-from-home culture has put more demands on the role of the Chief Information Security Officer (responsible for data protection and for managing insider risk threats). Controlling security threats and damage is no longer limited to a few modules, checkpoints, or assessments; it is altogether a new era, and hence Microsoft

Microsoft Purview - The New Road

Microsoft Purview - a comprehensive set of solutions formed by combining Azure Purview and the Microsoft 365 compliance products, which help your organization govern, protect, and manage your data. You gain more insight into your data wherever it lives and full control over the data life cycle. Three pillars. Data security: solutions include Data Loss Prevention, Information Barriers, Information Protection, Insider Risk Management, and Privileged Access Management. By defining and applying DLP policies, you can identify, monitor, and automatically protect sensitive items (PI and SPI). DLP detects sensitive items by using deep content analysis. DLP lifecycle: Plan, Prepare, and Deploy. DLP policies can be applied to data at rest, data in use, and data in motion in locations such as: Exchange Online email; SharePoint sites; OneDrive accounts; Teams chat and channel messages; Microsoft Defender for Cloud Apps; Windows 10, Windows 11, and macOS (three latest released versions) devices; On-premises reposit

Spark settings in Microsoft Fabric

Microsoft Fabric brings many teams onto one platform: it joins the Data Engineering, Data Science, and Reporting landscapes. Lakehouse is a new concept in Fabric. The question is why we would use a Lakehouse when Microsoft already has existing data storage platforms and multiple options in the Data Engineering area. I believe Microsoft is now looking to operate on a fully managed compute platform that can support Data Engineering and Data Science experiences, and Microsoft Fabric started its journey by adopting Apache Spark features and services. Fabric uses starter pools. With starter pools, we can expect rapid Spark session initialization, typically within 5 to 10 seconds, with no need for manual setup. Starter pools have Spark clusters that are always on and ready for your requests. Starter pools are a fast and easy way to use Spark on the Microsoft Fabric platform within seconds. You can use Spark sessions right away, instead of waiting for Spark to

How to fix ModuleNotFoundError - No module named pymongo in Notebook

One common issue while working with Python or Spark is an error message saying that a module was not found. "Module not found" means that a library still has to be installed or configured in the environment. Once this is done, the application will be able to find the required module. For example, while using the code below in a Fabric Notebook to connect to a MongoDB database and display records, I got a "no module found" error message. Running the code throws the error: ModuleNotFoundError - No module named pymongo. To fix the above issue, the required libraries must be installed. #install the required packages ! pip install pymongo ! pip install certifi Once done, the Azure notebook is able to connect to the MongoDB database, and Azure returns a confirmation log. Collecting pymongo Downloading pymongo-4.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (676 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 676.9/676.9 kB 17.3 MB/s eta 0:00:00 a 0:00
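
Before running connection code, it can help to check which modules are actually missing from the notebook environment. The following is a minimal sketch using only the Python standard library. The helper names `find_missing_modules` and `pip_install_lines` are hypothetical, and the sketch assumes each import name matches its pip package name (true for pymongo and certifi, but not for every library). It only handles top-level module names.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


def pip_install_lines(missing):
    """Build notebook install lines, assuming pip name == import name."""
    return [f"! pip install {name}" for name in missing]


# "json" ships with Python; the second name is a made-up placeholder
# standing in for a library that is not installed.
required = ["json", "hypothetical_missing_module_xyz"]
missing = find_missing_modules(required)
for line in pip_install_lines(missing):
    print(line)
```

In a real notebook, the list would hold the modules the code imports, for example `["pymongo", "certifi"]`, and the printed lines can be pasted into a cell and run.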

Microsoft Fabric - Use Lakehouse to upload source

Microsoft Fabric comes with multiple capabilities, and one of them is Data Engineering, where you bring your data to next-generation AI. In the Data Engineering platform, you load your data, perform operations on it to process it, and finally display your fine-tuned data in a nice way using Power BI capabilities. Tables and Files hold your data in the Data Engineering landscape. Tables hold data in a table structure, while Files lets you upload data files in csv/json/parquet format. An Upload option is there to upload your file(s) into Microsoft Fabric. (Figures: Files uploading in Lakehouse; Files uploaded in Lakehouse.) A simple way to display records is to create a Notebook and drag the file there; Fabric will create the code for you :) Click on the table data, and options are there to display the data in different formats. For example, you can view your data in chart format (such as a bar or pie chart). (Figures: Bar Format; Pie Format.) Notebook where you can do tricks using SQL to

Microsoft Fabric - Step-By-Step Learning Notes

So far, here is what we have learned in the Data Engineering area specific to the Microsoft Azure cloud: starting with Azure Data Factory, then Data Flow, Databricks, Scala, Python, and Azure Synapse, and finally landing on preparing visualization reports using Microsoft Power BI. Azure provides intelligence capabilities for handling data and prediction: we use the Microsoft Machine Learning workspace and its services. It is a mature platform providing services in an easy plug-and-play model. Microsoft now presents Fabric, the BIG jump in this era. At enterprise level, this includes everything under one umbrella: it brings together experiences such as Data Engineering, Data Factory, Data Science, Data Warehouse, Real-Time Analytics, and Power BI onto a shared SaaS foundation. (Picture from Microsoft Fabric site) To get into Fabric, Microsoft is offering a trial version, and it is a good way to get an insight and have a look at it. Power BI users can skip trial, h