Spark settings in Microsoft Fabric

Microsoft Fabric brings many teams together on one platform: it joins the Data Engineering, Data Science, and Reporting landscapes in a single place.


Lakehouse is a new concept in Fabric, and here it is.



The question is why we would use a Lakehouse when Microsoft already has existing data storage platforms and multiple options in the Data Engineering area.

I believe Microsoft is now looking to operate on a fully managed compute platform that can support both Data Engineering and Data Science experiences. Microsoft Fabric started its journey by selecting Apache Spark features and services.

Fabric uses starter pools. With starter pools, we can expect rapid Spark session initialization, typically within 5 to 10 seconds, with no need for manual setup. Starter pools have Spark clusters that are always on and ready for your requests.

Starter pools are a fast and easy way to use Spark on the Microsoft Fabric platform within seconds. You can use Spark sessions right away, instead of waiting for Spark to set up the nodes for you. To customize a starter pool, you need admin access to the workspace.
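Once a session from the starter pool is attached, a quick way to see what you were given is to read a few values back from it. This is a minimal sketch, assuming it runs in a Fabric notebook where the `spark` session variable is already defined. `describe_session` is a hypothetical helper name, and the keys are standard Apache Spark properties that may or may not be set in your pool.

```python
def describe_session(spark):
    """Return the Spark version and a few standard settings of a session."""
    keys = [
        "spark.executor.memory",
        "spark.executor.cores",
        "spark.dynamicAllocation.enabled",
    ]
    info = {"spark.version": spark.version}
    for key in keys:
        # conf.get(key, default) falls back to the default when the key is unset
        info[key] = spark.conf.get(key, "not set")
    return info


# In a Fabric notebook (assumption: `spark` is predefined there):
# for name, value in describe_session(spark).items():
#     print(f"{name} = {value}")
```

Running the commented lines in a notebook cell prints one line per setting, which makes it easy to compare a starter pool session with a custom pool session later.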

To configure the Spark settings, go to your workspace, click Workspace settings, and then select Data Engineering >> Spark settings.





To create a custom pool, click New Pool.

 

 

Configure the pool the way you want, and finally click Save to save the changes.
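Before filling in the pool dialog, it can help to write down the values you plan to enter and sanity-check them. The sketch below is only a planning aid and does not call any Fabric API. `PoolPlan`, its field names, and the list of node sizes are assumptions made for illustration, so adjust them to what your workspace actually shows.

```python
from dataclasses import dataclass

# Assumption: node size labels as they appear in the pool dialog
NODE_SIZES = ("Small", "Medium", "Large", "X-Large", "XX-Large")


@dataclass
class PoolPlan:
    """Hypothetical record of the values to enter for a custom pool."""
    name: str
    node_size: str = "Medium"
    autoscale: bool = True
    min_nodes: int = 1
    max_nodes: int = 4

    def problems(self):
        """Return a list of human-readable issues; empty means the plan looks sane."""
        issues = []
        if not self.name.strip():
            issues.append("pool name is empty")
        if self.node_size not in NODE_SIZES:
            issues.append(f"unknown node size: {self.node_size}")
        if self.min_nodes < 1:
            issues.append("min_nodes must be at least 1")
        if self.max_nodes < self.min_nodes:
            issues.append("max_nodes is smaller than min_nodes")
        return issues


plan = PoolPlan(name="etl-pool", node_size="Large", min_nodes=1, max_nodes=8)
print(plan.problems())  # prints []
```

An empty list means the numbers are at least internally consistent. The actual limits still depend on your Fabric capacity.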

 


The Environment tab lets you select which runtime environment you want to use for Spark.



As with Spark, many Power BI admin functionalities are available here as well. I will walk through them in an upcoming post.




