Welcome to this video about using natural language to create data visualizations with built-in agents in LangChain.

After watching this video, you'll be able to identify when to use the LangChain Pandas agent; explain how the LangChain Pandas agent differs from other agents; describe how to set up the LangChain Pandas agent with an IBM watsonx.ai model; explain how to use natural language to analyze and visualize data in a Pandas DataFrame; summarize how the agent-generated Python code connects to the data outputs; and identify best practices for safely prompting and using AI tools for data analysis tasks.

The Pandas agent answers questions by generating and running Python code on the fly. This dynamic method of running code is ideal for exploration and rapid prototyping, but is not recommended for production environments unless comprehensive safeguards are in place. These features are currently available in LangChain's langchain-experimental package. Let's get started.

The create_pandas_dataframe_agent function works just like other LangChain agents, with a few key differences. First, the agent it creates uses a pre-configured set of functions and prompts, saving time and effort. Next, the agent operates on the existing Pandas DataFrame that you provide. The user inputs a natural language prompt, and the agent responds with the appropriate answer, whether that answer is a value, a summary, or a visualization.

Next, explore how to set up and use this built-in agent. First, import pandas as pd. Next, load the data into a DataFrame object, or df. In this instance, you will use the Student Alcohol Consumption dataset, in CSV format, by UCI Machine Learning. You can display the dataset as a table: use the df.head() method to display the column headings and the first five rows of data. Your analysis will focus on the sex and age columns. In this dataset, sex is synonymous with gender and is represented by M for male and F for female, and the students' ages are numeric values between 15 and 22.
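The loading step described above can be sketched as follows. This is a minimal illustration, not the exact code from the video: the rows are made up, and the in-memory CSV is a hypothetical stand-in for the real dataset file so that the snippet runs on its own. Only the column names sex and age come from the video.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file: a few made-up rows with the
# two columns the video focuses on. With the real dataset you would call
# pd.read_csv() on the path of the file you downloaded instead.
SAMPLE_CSV = """sex,age
F,18
M,17
F,15
M,18
F,16
M,22
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# head() returns the first five rows by default, under the column headings.
print(df.head())

# Quick look at the two columns of interest.
print(sorted(df["sex"].unique()))
print(df["age"].min(), df["age"].max())
```

With the real file, the same head() call shows every column of the dataset, not just these two.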
Next, set up your credentials, including the model ID and generation parameters. From ibm_watsonx_ai.metanames, import GenTextParamsMetaNames as GenParams and create a dictionary to store credential information. Specify the model ID for the Llama 3 70B model, and define the generation parameters used to initialize the model, which runs on IBM watsonx.ai for text generation. You can also include additional configuration settings, such as the maximum number of tokens to generate. An important note: the models available on watsonx.ai can change. Always check for the latest models on the watsonx API or any other LLM host APIs.

Next, let's load a watsonx LLM and connect it to LangChain. Begin by importing the model class from IBM's Foundation Models and WatsonxLLM from LangChain's watsonx extension. Next, provide the credentials and settings you configured earlier. Here you see the model ID, parameters, project ID, and space ID. This step connects your model instance with the required configurations. Then specify WatsonxLLM as the LLM. This action integrates the watsonx model with LangChain so that you can build chatbots, chains, and tools using LangChain's features. And always check the latest LangChain documentation, as the specific syntax can evolve with product updates.

Now set up the Pandas DataFrame agent from LangChain. First, import create_pandas_dataframe_agent. Next, create a DataFrame agent that connects the LLM to a DataFrame to answer data questions using natural language. To do so, pass the LLM and the DataFrame (df), the standard Pandas object created earlier. Set verbose=True if you want to see more details while the code runs. Set return_intermediate_steps=True to view the code generated along the way, which is great for debugging or understanding the logic.

Now let's learn how to query the data using natural language. You can ask a natural question, such as, "How many rows are in this file?"
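The setup steps above can be sketched roughly as follows. Treat this as an illustration under stated assumptions rather than the code shown in the video: the model ID string, the parameter values, and the environment variable names are assumptions, and the allow_dangerous_code flag is an opt-in that recent langchain-experimental releases require before the agent may run generated code. The LangChain and IBM imports sit inside the function so that the file loads even where those packages are not installed.

```python
import os

# Assumed model ID for Llama 3 70B; check the current watsonx.ai catalog.
MODEL_ID = "meta-llama/llama-3-70b-instruct"

# Generation parameters as plain string keys. The GenParams constants from
# ibm_watsonx_ai.metanames resolve to key names like these. Values are examples.
PARAMS = {
    "decoding_method": "greedy",
    "max_new_tokens": 256,
}


def build_agent(df):
    """Connect a watsonx.ai model to a DataFrame through LangChain."""
    from langchain_ibm import WatsonxLLM
    from langchain_experimental.agents.agent_toolkits import (
        create_pandas_dataframe_agent,
    )

    llm = WatsonxLLM(
        model_id=MODEL_ID,
        params=PARAMS,
        # Hypothetical environment variable names for your own credentials.
        url=os.environ["WATSONX_URL"],
        project_id=os.environ["WATSONX_PROJECT_ID"],
        apikey=os.environ["WATSONX_APIKEY"],
    )
    return create_pandas_dataframe_agent(
        llm,
        df,
        verbose=True,                    # print the agent's reasoning as it runs
        return_intermediate_steps=True,  # keep the generated code in the result
        allow_dangerous_code=True,       # opt in to running generated Python
    )


def ask(agent, question):
    """Send one natural language question; return the answer and the steps."""
    response = agent.invoke({"input": question})
    return response["output"], response["intermediate_steps"]
```

A call such as ask(build_agent(df), "How many rows are in this file?") would then return the answer text together with the list of intermediate steps, each pairing the action the agent took with what it observed. Run this only in a sandboxed environment, because the agent executes the code it generates.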
Pass the question to the agent's invoke method, and the Pandas agent returns the answer: 395 rows. If you're curious, you can view the code behind the answer. You can inspect these steps in the intermediate_steps field of the response, which displays the generated code, including DataFrame filters and aggregation commands. You'll see the exact code the LLM used, such as len(df).

This next example asks the agent, "How many students are 18 years old?" The agent responds, "There are 82 students who are 18 years old." To see the code that produced this answer, you can again view intermediate_steps. You will see that the code filtered the DataFrame for rows with an age of 18, and then counted the number of matching rows.

You can also generate visualizations using natural language. Type a natural language prompt that requests the information you want, such as "Plot the gender count with bars." The LangChain Pandas agent then generates a clear, insightful chart in seconds. No coding is needed by you or another user; you use simple, natural instructions. Although you used the word gender in your request, the LLM understood that you were requesting data from the column labeled sex and located the information in the dataset.

Now that you've seen the LangChain Pandas agent in action, here are some practices to help you safely obtain the best results. Always use sandboxed environments to prevent unintended modifications to live data and to limit prompt injection risks that could result in the running of malicious code. To effectively and safely prompt and use AI tools for data analysis, design clear and specific prompts to avoid ambiguous responses. Then combine LLM analysis with human expertise to validate the results, for example by checking that the LLM analyzed the correct data and returned the correct results, and iteratively refine your prompts and analysis.

Now let's recap what you've learned.
You now know that using the LangChain Pandas agent is ideal for exploration and rapid prototyping but is not recommended for production environments unless comprehensive safeguards are in place. The LangChain Pandas agent is different because it uses preconfigured functions and prompts, operates on your existing Pandas DataFrame, accepts prompt inputs from users, and responds to prompts with answers or visuals.

You can set up the LangChain Pandas agent to work with an external LLM, such as an IBM watsonx.ai model. To do so, first initialize your model with WatsonxLLM, then connect it to a Pandas DataFrame using the create_pandas_dataframe_agent function. You can analyze and visualize data by asking the Pandas DataFrame agent natural language questions; the agent generates code and returns clear data insights. The agent-generated Python code interacts directly with your DataFrame, filtering, aggregating, and visualizing data based on your natural language prompts.

And always use sandboxed environments, design clear prompts, validate LLM analysis with human expertise, and iteratively refine your queries for safe and effective AI-driven data analysis.
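To make the link between prompts and generated code concrete, here is a sketch of pandas expressions that correspond to the three questions asked in this video. Only len(df) is quoted in the video itself; the other two lines are plausible equivalents of what it describes, and the agent may generate different code. The small DataFrame is made up so that the snippet runs on its own, so its numbers will not match the 395 rows or 82 students of the real dataset.

```python
import pandas as pd

# Made-up sample with the two columns used in the video.
df = pd.DataFrame(
    {
        "sex": ["F", "M", "F", "M", "F", "M"],
        "age": [18, 17, 15, 18, 16, 22],
    }
)

# "How many rows are in this file?"
row_count = len(df)

# "How many students are 18 years old?"
# Filter the rows where age equals 18, then count the matches.
eighteen_count = len(df[df["age"] == 18])

# "Plot the gender count with bars."
# Count the rows per value of the sex column; with matplotlib installed,
# gender_counts.plot(kind="bar") would draw these counts as a bar chart.
gender_counts = df["sex"].value_counts()

print(row_count)
print(eighteen_count)
print(gender_counts.to_dict())
```

Comparing expressions like these with what appears in intermediate_steps is one practical way to validate that the agent analyzed the right column and returned the right result.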