Overly large Python Jupyter Notebook (.ipynb) files

In summary, the large files are most likely caused by cell output (printed data or images) that is stored inside the notebook file. The easiest way to shrink such a file is to clear the output, for example with "Restart & Clear Output" in the Kernel menu, before saving it.
  • #1
WWGD
TL;DR Summary
For no obvious reason, some of my Python notebook files are far larger than others with similar content
Hi all,
I was looking at the file manager in Jupyter and noticed that, in the listing of notebooks, all similar to each other in size and scope, some files stand out in size for no apparent reason. I have some 40 notebooks. All but two range from 10 KB to 284 KB at the extremes; the other two, also notebooks with the same .ipynb extension, are 77.3 MB and 91 MB respectively. As I said, the latter two are very similar to the other 38: regular notebooks with Python code. Why would these two files be so much larger than the other 38?

EDIT: I hope it's OK to post an additional question in the same post:
How do we do a Ctrl+Alt+Delete within a Python notebook? The thing is, this notebook is part of a virtual back-end server and not part of the physical machine.
 
  • #2
WWGD said:
I have some 40 notebooks. All but two range from 10 KB to 284 KB at the extremes; the other two, also notebooks with the same .ipynb extension, are 77.3 MB and 91 MB respectively. As I said, the latter two are very similar to the other 38: regular notebooks with Python code. Why would these two files be so much larger than the other 38?
Do the large ones contain images from e.g. matplotlib?

WWGD said:
How do we do a Ctrl+Alt+Delete within a Python notebook? The thing is, this notebook is part of a virtual back-end server and not part of the physical machine.
Windows, Linux, iOS...? Also how did you start Jupyter - from the command line, Anaconda Navigator...?

Try visiting http://localhost:8888/tree#running in a browser and selecting 'shutdown' on the appropriate process.

If Windows: try switching to the terminal window running the Jupyter Notebook process with Alt-Tab then hit Ctrl-C.
 
  • #3
When you save the notebook, it saves the output of all of the cells, which may include large amounts of data or images. That's why the files are so large. If, before you save the file, you open the Kernel menu and click "Restart & Clear Output", it will clear all of the output in the cells. Then when you save it, you will be saving just the code in the cells, not the output, which will probably make the file much smaller.
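
The same clean-up can also be scripted, since an .ipynb file is plain JSON. The following is only a minimal sketch, assuming the common nbformat 4 layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"). The function names and file paths are hypothetical, and it is worth keeping a backup of the original file.

```python
import json
from pathlib import Path


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with every code cell's output removed."""
    # Round-tripping through JSON is a cheap deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):  # assumes the nbformat 4 layout
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def strip_file(src: str, dst: str) -> None:
    """Read the notebook at src, drop its outputs, and write the result to dst."""
    notebook = json.loads(Path(src).read_text(encoding="utf-8"))
    cleaned = strip_outputs(notebook)
    Path(dst).write_text(json.dumps(cleaned, indent=1), encoding="utf-8")


# Example call with hypothetical file names:
# strip_file("big_notebook.ipynb", "big_notebook_stripped.ipynb")
```

Writing to a separate destination file leaves the original untouched, so the two sizes can be compared before anything is deleted.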
 
  • #4
pbuk said:
Do the large ones contain images from e.g. matplotlib?

Windows, Linux, iOS...? Also how did you start Jupyter - from the command line, Anaconda Navigator...?

Try visiting http://localhost:8888/tree#running in a browser and selecting 'shutdown' on the appropriate process.

If Windows: try switching to the terminal window running the Jupyter Notebook process with Alt-Tab then hit Ctrl-C.
Thank you for your reply. I am using Windows 10 and not quite from either of your options. I enter 'Jupyter' into the search box, then the OS accesses the command line and gives me access through localhost 8888. I don't have any plots at all, I don't use anything other than python proper in my overly large files.
 
  • #5
WWGD said:
Thank you for your reply. I am using Windows 10 and not quite from either of your options. I enter 'Jupyter' into the search box, then the OS accesses the command line and gives me access through localhost 8888. I don't have any plots at all, I don't use anything other than python proper in my overly large files.
My largest python source code file (of 55 different source code files) is 2 KB. For my purposes of merely learning python syntax I don't need or use any IDE -- I open a Win 10 command prompt window and run python.exe from that window.
 
  • #6
Mark44 said:
My largest python source code file (of 55 different source code files) is 2 KB. For my purposes of merely learning python syntax I don't need or use any IDE -- I open a Win 10 command prompt window and run python.exe from that window.
I am too used to the Jupyter notebook interface. But thanks, I will consider that.
 
  • #7
WWGD said:
Thank you for your reply. I am using Windows 10 and not quite from either of your options. I enter 'Jupyter' into the search box, then the OS accesses the command line and gives me access through localhost 8888. I don't have any plots at all, I don't use anything other than python proper in my overly large files.
So does the http://localhost:8888/tree#running -> Shutdown method work for you?

There is no virtual machine involved, just Jupyter running in the background, running your Python programs in a shell, with a web server as the interface.
 
  • #8
WWGD said:
I don't have any plots at all, I don't use anything other than python proper in my overly large files.
This code will generate a pretty large .ipynb file:
Infinite loop:
while True:
    print(1)
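
To get a feel for how quickly printed lines add up, here is a rough sketch that builds a notebook-like dictionary in memory and measures its size as JSON. It assumes the nbformat 4 convention of storing stream output as a list of lines. The helper names are made up for illustration, and the real file written by Jupyter will differ somewhat in metadata and whitespace.

```python
import json


def notebook_with_printed_lines(n_lines: int) -> dict:
    """Build a minimal nbformat-4-style notebook whose single cell printed n_lines lines."""
    return {
        "cells": [{
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["for _ in range(%d):\n" % n_lines, "    print(1)"],
            "outputs": [{
                "output_type": "stream",
                "name": "stdout",
                "text": ["1\n"] * n_lines,  # one stored string per printed line
            }],
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }


def saved_size_bytes(notebook: dict) -> int:
    """Size in bytes of the notebook when serialized as indented JSON."""
    return len(json.dumps(notebook, indent=1).encode("utf-8"))


# Compare an output-free notebook with ones that printed many lines.
for n in (0, 1_000, 100_000):
    print(n, saved_size_bytes(notebook_with_printed_lines(n)))
```

The size grows linearly with the number of printed lines, so a loop that runs for a while before being interrupted can leave many megabytes of output in the saved file.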
 
  • #9
pbuk said:
So does the http://localhost:8888/tree#running -> Shutdown method work for you?

There is no virtual machine involved, just jupyter running in the background running your python programs in a shell and a web server as an interface.
Thanks, my bad. I meant a virtual server at localhost. Thanks for the suggestion. I haven't gotten to my PC yet; I will let you know.
 
  • #10
pbuk said:
This code will generate a pretty large .ipynb file:
Infinite loop:
while True:
    print(1)
Hmm... I remember now that I made several copies (for practice) of an algorithm to print all primes in a given range. It was from 2 to around 10,000. Maybe that explains it.
 
  • #11
Still nothing virtual. It's all running in the same Windows kernel in multiple processes spawned by the Jupyter server.
 
  • #12
WWGD said:
Hmm... I remember now that I made several copies (for practice) of an algorithm to print all primes in a given range. It was from 2 to around 10,000. Maybe that explains it.
Ya think? ;)

If you change the extension to .json and open it up in a browser you will probably be able to see all those primes. Change back to .ipynb to open up again in Jupyter.
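
Instead of scrolling through the raw JSON, a few lines of Python can report which cells hold the bulk of the stored output. This is a sketch under the assumption that the file follows the nbformat 4 layout (top-level "cells", each code cell with an "outputs" list); the function name and the file name in the usage comment are hypothetical.

```python
import json


def output_sizes(notebook: dict) -> list[tuple[int, int]]:
    """Return (cell_index, output_bytes) for each code cell, largest first."""
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells carry no outputs
        size = len(json.dumps(cell.get("outputs", [])).encode("utf-8"))
        sizes.append((index, size))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


# Example use with a hypothetical file name:
# with open("big_notebook.ipynb", encoding="utf-8") as handle:
#     for index, size in output_sizes(json.load(handle))[:5]:
#         print(f"cell {index}: {size} bytes of output")
```

The cell indices count from zero in file order, so the first entries of the result point at the cells worth clearing or rerunning with less printing.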
 
  • #13
Thanks again. Upon checking, I realized these were from the days before I became (a bit more) proficient with indentation issues. Let's just say I have Ctrl+C etched into my nervous system, to stop way too many infinite loops. Sadly, I am often still too impatient to sit down and write up the flowchart :(, so I keep repeating these indentation mistakes at times. This means some of the prime printouts wrote out the same prime more than once. If I were disciplined enough, I would try to figure out the logic flaw. I will do it by this week's end, when I will look into indenting more carefully. Thanks.
 

FAQ: Overly large Python Jupyter Notebook (.ipynb) files

1. Why are overly large Python Jupyter Notebook (.ipynb) files a problem?

Overly large .ipynb files can be a problem because they can take up a lot of storage space and can be slow to load and run. This can make it difficult to work with the file and can decrease efficiency.

2. How do I determine if my .ipynb file is too large?

The size of a .ipynb file can vary depending on the content, but a good rule of thumb is that anything over 10MB is considered large. You can check the file size by right-clicking on the file and selecting "Properties" (on Windows) or by using the "ls -l" command in the terminal (on Mac or Linux).
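
If there are many notebooks in one folder, the same check can be done for all of them at once. This is a small sketch using only the standard library; the function name is an invention for this example, and it looks only at the given folder, not at subfolders.

```python
from pathlib import Path


def notebooks_by_size(folder: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for each .ipynb file in folder, largest first."""
    files = [(path.name, path.stat().st_size) for path in Path(folder).glob("*.ipynb")]
    return sorted(files, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # List the notebooks in the current working directory.
    for name, size in notebooks_by_size("."):
        print(f"{size / 1_000_000:8.2f} MB  {name}")
```

Sorting largest first puts any outliers, such as a 90 MB notebook among files of a few hundred KB, at the top of the listing.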

3. What are some ways to reduce the size of a .ipynb file?

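The most effective way is usually to clear the stored cell outputs before saving. In the classic Jupyter Notebook interface this is Cell > All Output > Clear, or "Restart & Clear Output" in the Kernel menu. Removing cells you no longer need also helps, although code itself rarely accounts for much of the size. Another option is to export the code to a different format, such as a .py script, which contains no outputs and is therefore usually much smaller.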

4. Can I split a large .ipynb file into smaller files?

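Yes, it is possible to split a large .ipynb file into smaller files. The simplest way is to copy and paste cells into new .ipynb files. Note that the "nbconvert" command converts a notebook to other formats (such as HTML or a .py script); it does not split one notebook into several, so splitting has to be done by hand or with a short script that edits the notebook's JSON.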
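
One possible way to script such a split is to treat the notebook as JSON and divide its list of cells into chunks. The following is only a sketch, assuming the nbformat 4 layout with a top-level "cells" list; the function name and the chunk size parameter are hypothetical, and each part would still need to be written to its own .ipynb file.

```python
import json


def split_notebook(notebook: dict, cells_per_part: int) -> list[dict]:
    """Split a notebook dict into parts holding at most cells_per_part cells each."""
    if cells_per_part < 1:
        raise ValueError("cells_per_part must be at least 1")
    cells = notebook.get("cells", [])
    # Everything except the cells (metadata, nbformat version) is shared by all parts.
    shared = {key: value for key, value in notebook.items() if key != "cells"}
    parts = []
    for start in range(0, len(cells), cells_per_part):
        part = json.loads(json.dumps(shared))  # independent copy per part
        part["cells"] = cells[start:start + cells_per_part]
        parts.append(part)
    return parts
```

Keep in mind that later parts may rely on variables or imports defined in earlier cells, so a split notebook may not run on its own without copying those definitions across.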

5. Are there any tools or plugins that can help with managing large .ipynb files?

Yes, there are several tools and plugins available that can help with managing large .ipynb files. For example, the "nbstripout" tool strips output from a .ipynb file, reducing its size, and it can be installed as a Git filter so that outputs are never committed. The "nbzip" extension zips notebook files for download; it does not make the .ipynb file itself smaller. Additionally, there are several extensions available for Jupyter Notebook that can help with managing and organizing large files.
