
NotebookLlama is a fully open source tool, built on LlamaCloud, that helps users manage documents and generate podcast-like audio content. It is an alternative to Google NotebookLM for researchers, students and business users. Users can upload documents, create knowledge bases, and extract key information through automated analysis. NotebookLlama also converts document content into natural-sounding audio, so users can listen to it on the go. The project is hosted on GitHub, with transparent code, strong community support, and a clear installation process, which makes it a good fit for tech enthusiasts and professionals.


Feature List

  • Document Upload and Management: Support for uploading documents in multiple formats (e.g. PDF) to build individual or team knowledge bases.
  • Knowledge Extraction and Summarization: Automatically analyze documents, extract core content and generate summaries through LlamaCloud technology.
  • Audio Generation: Convert document content into podcast-like audio with support for natural speech output.
  • Open Source and Customizable: The code is completely open source, so users can modify or extend it to suit their needs.
  • Multi-platform support: Runs via Docker and Streamlit and supports local or cloud deployments.
  • Intelligent Search: Provides intelligent search based on document content to quickly locate information.

 

Usage Guide

Installation process

To use NotebookLlama, users need to complete the installation and configuration first. Below are the detailed installation steps:

  1. Clone the repository
    Run the following command in the terminal to clone the NotebookLlama project locally:

    git clone https://github.com/run-llama/notebookllama
    

    Go to the project directory:

    cd notebookllama/
    
  2. Install dependencies
    Use the uv tool to install the necessary dependency packages:

    uv sync
    

    Make sure Python and uv are installed. If they are not, install Python 3.8 or later first, then install uv with pip install uv.

  3. Configure API keys
    The project requires three API keys: OpenAI, ElevenLabs and LlamaCloud. The steps are as follows:

    • Open the .env.example file in the project directory.
    • Get the API key:
      • OPENAI_API_KEY: Log in to the OpenAI platform and go to Account Settings to generate a key.
      • ELEVENLABS_API_KEY: Get it on the Settings page of the ElevenLabs website.
      • LLAMACLOUD_API_KEY: Visit the LlamaCloud dashboard to get the key.
    • Enter the keys in the .env.example file, then rename the file:
      mv .env.example .env
      
  4. Run the initialization script
    Run the following commands to create the LlamaCloud extraction agent and index:

    uv run tools/create_llama_extract_agent.py
    uv run tools/create_llama_cloud_index.py
    
  5. Start the services
    Start the Postgres and Jaeger services with Docker:

    docker compose up -d
    

    Start the MCP server:

    uv run src/notebookllama/server.py
    
  6. Run the Streamlit application
    Launch the Streamlit front-end interface:

    streamlit run src/notebookllama/Home.py
    

    Install ffmpeg (if not already installed) to support audio functionality:

    • On Ubuntu: sudo apt-get install ffmpeg
    • On macOS: brew install ffmpeg
  7. Access the application
    Open your browser and visit http://localhost:8751/. You can now start using NotebookLlama.
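
Before running the initialization scripts in step 4, it can help to confirm that the .env file from step 3 actually contains all three keys. The following is a minimal sketch, not part of the NotebookLlama repository: the function name missing_keys is hypothetical, and it assumes the file uses plain KEY=value lines with the three variable names listed above.

```shell
# Hypothetical helper (not part of NotebookLlama): print the name of each
# required API key that is absent or empty in the given env file.
# The body runs in a subshell, so its variables do not leak to the caller.
missing_keys() (
  env_file="${1:-.env}"
  for key in OPENAI_API_KEY ELEVENLABS_API_KEY LLAMACLOUD_API_KEY; do
    # Keep the value of the last line that starts with "KEY=".
    value="$(awk -v k="$key" \
      'index($0, k "=") == 1 { v = substr($0, length(k) + 2) } END { print v }' \
      "$env_file" 2>/dev/null || true)"
    # Treat an unset value and an empty quoted value as missing.
    case "$value" in
      '' | '""' | "''") echo "$key" ;;
    esac
  done
)

# Example: missing_keys .env
```

If the function prints nothing, every key has a non-empty value. It does not verify that the keys are valid, only that they are present.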

Main Functions

Document uploading and knowledge base creation

  • Procedure:
    1. Open the Streamlit interface and click the "Upload Document" button.
    2. Select a PDF or other supported document to upload to the system.
    3. The system automatically parses the document content and incorporates it into the knowledge base.
  • Features:
    • Supports batch uploading, suitable for processing large amounts of research materials.
    • Document content is automatically indexed for subsequent search and analysis.

Knowledge Extraction and Summarization

  • Procedure:
    1. Select the uploaded document in the interface.
    2. Click the "Extract Information" or "Generate Summary" button.
    3. The system analyzes the document and outputs key points, summaries, or Q&A content.
  • Features:
    • Intelligent analysis based on LlamaCloud for accurate and concise extraction.
    • Supports user-defined extraction scope, e.g., extracting only a certain chapter.

Audio Generation

  • Procedure:
    1. Select the document or summary you want to turn into audio.
    2. Click the "Generate Podcast" button. The system calls the ElevenLabs API to convert the text to speech.
    3. Download the generated audio file or play it directly online.
  • Features:
    • The audio is natural and smooth, close to a human-hosted podcast.
    • Supports multi-language voice output, suitable for international use.

Intelligent Search

  • Procedure:
    1. Enter a keyword or question in the interface.
    2. The system returns relevant document fragments or answers.
  • Features:
    • Search results are drawn from the content of the uploaded documents.
    • Supports complex queries, such as "summarize the topic of a document".

Notes

  • Ensure that the network is stable; the API calls require an internet connection.
  • If audio generation fails, check whether ffmpeg is properly installed.
  • Regularly update the code base to get the latest features: git pull origin main
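
When troubleshooting, a quick first check is whether the command-line tools this guide relies on are actually on the PATH. The sketch below is illustrative only: require_cmd is a hypothetical helper name, and it checks only that a command exists, not that its version is suitable.

```shell
# Hypothetical helper (not part of NotebookLlama): report whether a
# command is available on PATH. Returns non-zero when it is missing.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Example: check the tools used in the installation steps.
# for tool in uv docker ffmpeg; do require_cmd "$tool" || true; done
```

A "missing: ffmpeg" line points to the ffmpeg installation step as the likely cause of a failed audio generation.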

Application Scenarios

  1. Academic research
    Researchers can upload academic papers to quickly extract key information or generate summaries. The audio feature is suitable for listening to the content of the paper while commuting to improve efficiency.
  2. Business Analysis
    Enterprise users upload market reports or internal documents to build a knowledge base. Intelligent search and summarization features help to quickly locate key data to aid decision-making.
  3. Education
    Students upload textbooks or handouts to generate summaries or audio for easy review. The audio feature is especially suitable for auditory learners.
  4. Content creation
    Podcast creators can convert articles or notes to audio to quickly generate podcast content and save recording time.

 

FAQ

  1. What document formats does NotebookLlama support?
    Currently supports PDF, TXT and other common formats, and may be expanded to more formats in the future.
  2. Do I need to pay to use the API?
    Yes, the APIs for OpenAI, ElevenLabs and LlamaCloud require their respective paid accounts. Users will need to register and get the key themselves.
  3. Does local deployment require high-performance hardware?
    A typical home computer (8GB RAM, 4-core CPU) can run it. A Docker deployment requires about 10GB of disk space.
  4. How good is the generated speech?
    The voice, provided by ElevenLabs, is close to the level of a human announcer and supports multiple languages and tones.