Spark settings in Microsoft Fabric

April 11, 2024

Microsoft Fabric brings many teams together on one platform, joining the Data Engineering, Data Science, and Reporting landscapes.

Lakehouse is a new concept in Fabric.

The question is why we would use a Lakehouse when Microsoft already has existing data storage platforms, including multiple options in the Data Engineering area.

I believe Microsoft is now looking to operate on a fully managed compute platform that can support both Data Engineering and Data Science experiences. Microsoft Fabric started its journey by building on Apache Spark features and services.

Fabric uses starter pools. With starter pools, we can expect rapid Spark session initialization, typically within 5 to 10 seconds, with no need for manual setup. Starter pools have Spark clusters that are always on and ready for your requests.

Starter pools are a fast and easy way to use Spark on the Microsoft Fabric platform within seconds. You can use Spark sessions right away, instead of waiting for Spark to set up the nodes for you. To customize a starter pool, you need admin access to the workspace.

To configure these settings, go to your workspace, click Workspace settings, and then select Data Engineering >> Spark settings.
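
After changing the pool settings, one way to see what a session actually received is to read the configuration back from a notebook. The following is only a minimal sketch, assuming it runs in a Fabric notebook where the `spark` session variable is already defined. The helper name `summarize_spark_settings` and the chosen list of keys are my own illustration, not something Fabric provides.

```python
# Standard Apache Spark property names; which ones a starter pool
# actually sets is an assumption to verify in your own workspace.
DEFAULT_KEYS = (
    "spark.app.name",
    "spark.executor.instances",
    "spark.executor.cores",
    "spark.executor.memory",
    "spark.dynamicAllocation.enabled",
)


def summarize_spark_settings(spark, keys=DEFAULT_KEYS):
    """Return the effective value of each key for the given session.

    `spark` is expected to be a SparkSession (or any object exposing
    `conf.get(key, default)`). Keys that are not set map to "<not set>".
    """
    return {key: spark.conf.get(key, "<not set>") for key in keys}


# Usage inside a notebook, where `spark` already exists:
# for name, value in summarize_spark_settings(spark).items():
#     print(f"{name} = {value}")
```

Passing an explicit default to `spark.conf.get` avoids an error when a property is unset, so the same helper can be run before and after a settings change to compare the results.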