In Power BI import mode (Premium capacity), are the dataset and storage size limits based on compressed or uncompressed data?
Similarly, when the model is loaded into memory, is the memory used based on compressed or uncompressed data?
For datasets it is compressed. For example, if the source data is 10 x 100 MB CSV files (1 GB total), then loading it into a dataset (assuming the engine can compress at a 10:1 ratio) will result in roughly a 100 MB dataset in memory.
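As a rough sketch of that arithmetic (the 10:1 ratio is only an assumption for illustration; real VertiPaq compression depends heavily on column cardinality and data types):

```python
# Back-of-the-envelope estimate only; the 10:1 ratio is an assumed figure,
# not a guarantee - actual VertiPaq compression varies with the data.
SOURCE_FILES = 10
FILE_SIZE_MB = 100
ASSUMED_COMPRESSION_RATIO = 10  # hypothetical VertiPaq ratio

source_total_mb = SOURCE_FILES * FILE_SIZE_MB                     # 1000 MB of CSV
estimated_model_mb = source_total_mb / ASSUMED_COMPRESSION_RATIO  # ~100 MB dataset

print(f"Source CSV total:       {source_total_mb} MB")
print(f"Estimated dataset size: {estimated_model_mb:.0f} MB")
```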
The Power BI / SQL Server Analysis Services (Tabular) engine is called VertiPaq. The best post about how it compresses data is here.
Data in dataflows will also be compressed, but it is more of a basic ZIP-style compression and not as efficient, so the 10 example files could take up around 300 MB in that format.
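If you want a feel for what basic ZIP/DEFLATE-style compression achieves on CSV text, a quick sketch like the one below works; the ratio depends entirely on how repetitive the data is, so the 300 MB figure above is only an illustrative guess, not a measured dataflow size.

```python
import csv
import gzip
import io
import random

# Build a CSV in memory with somewhat repetitive data, then compare
# its raw size against a DEFLATE (ZIP-style) compressed copy.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "region", "amount"])
for i in range(100_000):
    writer.writerow([i,
                     random.choice(["North", "South", "East", "West"]),
                     round(random.uniform(0, 1000), 2)])

raw = buf.getvalue().encode("utf-8")
compressed = gzip.compress(raw)

print(f"Raw CSV:        {len(raw) / 1e6:.2f} MB")
print(f"Gzip (DEFLATE): {len(compressed) / 1e6:.2f} MB")
print(f"Ratio:          {len(raw) / len(compressed):.1f}:1")
```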
Indeed, 1 GB of data can be compressed to about 100 MB, and hence the size of the .pbix file is reduced to 100 MB. However, that is the storage size (after compression), which is not the same as the size of the model when it is loaded into memory. When the model is loaded into memory, does the entire 1 GB get loaded, or only 100 MB? I am confused on this point. Any reference for this concept would be helpful.
Yes, it will be further compressed by the VertiPaq engine, so it will be smaller than the file storage. I recommend using DAX Studio and its metrics to analyse the in-memory dataset sizes.
Can you please give me a reference that says the dataflow CSV data is zipped/compressed?
If you change the .pbix extension to .zip you can see it; the best reference is gqbi.wordpress.com/2017/05/02/…
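To illustrate that point: since a .pbix file is a ZIP archive, you can list its contents and their stored vs. compressed sizes directly (a minimal sketch; "report.pbix" is just a placeholder for your own file):

```python
import zipfile

# A .pbix file is a ZIP archive; list its entries with their
# uncompressed and compressed sizes. "report.pbix" is a placeholder path.
with zipfile.ZipFile("report.pbix") as pbix:
    for info in pbix.infolist():
        print(f"{info.filename:<40} "
              f"stored={info.file_size:>12,} B  "
              f"compressed={info.compress_size:>12,} B")
```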