
NVIDIA's Nemotron-3-8B-Chat-SteerLM: Empowering Conversational AI with Stateful Text Generation

Introduction

NVIDIA's Nemotron-3-8B-Chat

In the world of AI, language models have taken center stage for their ability to generate human-like text responses to a wide range of queries. NVIDIA's Nemotron-3-8B-Chat-SteerLM is one such model, offering a powerful tool for generative AI creators working on conversational AI models. Let's dive into the details of this model and understand how it works, its intended use, potential risks, and its unique feature of remembering previous answers.

Model Overview

Nemotron-3-8B-Chat-SteerLM is an 8 billion-parameter generative language model based on the Nemotron-3-8B base model. It boasts customizability through the SteerLM method, allowing users to control model outputs dynamically during inference. This model is designed to generate text responses and code, making it a versatile choice for a range of applications.

Intended Application & Domain

This model is tailored for text-to-text generation, where it takes text input and generates text output. Its primary purpose is to assist generative AI creators in the development of conversational AI models. Whether it's chatbots, virtual assistants, or customer support systems, this model excels in generating text-based responses to user queries.

Model Type

Nemotron-3-8B-Chat-SteerLM belongs to the Transformer architecture family, renowned for its effectiveness in natural language processing tasks. Its architecture enables it to understand and generate human-like text.

Intended User

Developers and data scientists are the primary users of this model. They can leverage it to build conversational AI systems that generate coherent, contextually relevant text responses across a dialogue.

Stateful Text Generation

One of the standout features of this model is its statefulness. It has the ability to remember previous answers in a conversation. This capability allows it to maintain context and generate responses that are not just coherent but also contextually relevant. For example, in a multi-turn conversation, it can refer back to previous responses to ensure continuity and relevancy.

How the Model Works

Nemotron-3-8B-Chat-SteerLM is a large language model that operates by generating text and code in response to prompts. Users input a text prompt, and the model utilizes its pre-trained knowledge to craft a text-based response. The stateful nature of the model means that it can remember and consider the conversation history, enabling it to generate contextually appropriate responses. This feature enhances the conversational quality of the AI, making interactions feel more natural and meaningful.
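
To make the stateful behavior concrete, here is a minimal sketch of how an application typically maintains that conversation history on the client side. It is illustrative only and assumes a hypothetical generate() helper standing in for whatever Nemotron inference endpoint you deploy; it is not NVIDIA's API.

# Illustrative sketch: conversation "memory" is implemented by the serving
# application, which keeps the turn history and replays it in each prompt.

def generate(prompt: str) -> str:
    # Hypothetical placeholder: in practice this calls your deployed
    # Nemotron-3-8B-Chat-SteerLM endpoint and returns its reply.
    return "(model response)"

history = []  # accumulated (role, text) turns for the current conversation

def chat(user_message: str) -> str:
    history.append(("User", user_message))
    # Replay the full history so the model can refer back to earlier turns.
    prompt = "\n".join(f"{role}: {text}" for role, text in history) + "\nAssistant:"
    reply = generate(prompt)
    history.append(("Assistant", reply))
    return reply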

Performance Metrics

The model's performance is evaluated based on two critical metrics:

  1. Throughput: This metric measures how many requests the model can handle within a given time frame. It is essential for assessing the model's efficiency in real-world production environments.

  2. Latency: Latency gauges the time taken by the model to respond to a single request. Lower latency is desirable, indicating quicker responses and smoother user experiences.
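
As a rough illustration of how these two metrics can be measured against any inference endpoint, here is a small sketch that benchmarks requests sequentially. send_request is a hypothetical placeholder for the actual call to the model.

import time

def benchmark(send_request, prompts):
    """Return throughput (requests/second) and average latency (seconds/request)."""
    latencies = []
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        send_request(prompt)  # hypothetical call to the model endpoint
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = len(prompts) / elapsed
    average_latency = sum(latencies) / len(latencies)
    return throughput, average_latency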

Potential Known Risks

It's crucial to be aware of potential risks when using Nemotron-3-8B-Chat-SteerLM:

  • Bias and Toxicity: The model was trained on data from the internet, which may contain toxic language and societal biases. Consequently, it may generate responses that amplify these biases and return toxic or offensive content, especially when prompted with toxic inputs.

  • Accuracy and Relevance: The model may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text. This can lead to socially unacceptable or undesirable text, even if the input prompt itself is not offensive.

Licensing

The use of this model is governed by the "NVIDIA AI Foundation Models Community License Agreement." Users must adhere to the terms and conditions outlined in the agreement when utilizing the model.

Conclusion

NVIDIA's Nemotron-3-8B-Chat-SteerLM represents a significant advancement in generative AI for conversational applications. With its stateful text generation capability and Transformer architecture, it offers a versatile solution for developers and data scientists working in this domain. However, it's important to be mindful of potential biases and accuracy issues, as well as adhere to the licensing terms when utilizing this powerful AI tool.

Building an Indexing Pipeline for LinkedIn Skill Assessments Quizzes Repository

Creating an efficient indexing pipeline for the linkedin-skill-assessments-quizzes repository involves systematic cloning, data processing, indexing, and query service setup. This comprehensive guide will walk you through each step with detailed code snippets, leveraging the Whoosh library for indexing.

Pre-requisites

To create a new directory named linkedin_skill_assessments_quizzes, first open a command prompt and navigate to your current working directory. You can do this with the following command:

cd path_to_your_current_working_directory
Replace path_to_your_current_working_directory with the actual path where you want to create the new directory.

Alternatively, on Windows, you can open the command prompt in the current working directory by clicking on the address bar and typing cmd, and pressing enter.

Once you are in the desired working directory, create a new directory named linkedin_skill_assessments_quizzes by executing the following command:

mkdir linkedin_skill_assessments_quizzes

Now navigate to this new directory by executing the following command:

cd linkedin_skill_assessments_quizzes

This is where you will be cloning the repository and creating the indexing pipeline.

Introduction

LinkedIn Skill Assessments Quizzes is a repository of quizzes for various skills. It contains MD files for each quiz. The repository is available on GitHub. The repository has over 27,400 stars and 13,600 forks. It is a popular repository that is used by many people to prepare for interviews and improve their skills.

Step 1: Cloning the Repository

Start by cloning the repository to your local environment. This makes the content available for processing.

git clone https://github.com/Ebazhanov/linkedin-skill-assessments-quizzes.git
Cloning the repository

Step 2: Converting the MD Files to JSON

Processing the data involves parsing the MD files and converting them to JSON format to extract the relevant information. The following code snippet demonstrates how to extract the questions and answers from the MD files and save them in JSON files. This is required for indexing the data, which we will cover in the next step.

Add the following code to a file named process_data.py in the same directory where you cloned the repository.

import os
import json
import markdown2
import re

# Get the markdown files directory

cloned_repository_directory = r"C:\Users\Harminder Nijjar\Desktop\blog\kb-blog-portfolio-mkdocs-master\scripts\linkedin-skill-assessments-quizzes"

# Create a folder to store the JSON files

output_folder = os.path.join(cloned_repository_directory, "json_output")

# Create the output folder if it doesn't exist

os.makedirs(output_folder, exist_ok=True)

# Create a list to store data for each MD file

data_for_each_md = []

# Iterate through the Markdown files (*.md) in the current directory and its subdirectories

for root, dirs, files in os.walk(cloned_repository_directory):
    for name in files:
        if name.endswith(".md"):
            # Construct the full path to the Markdown file
            md_file_path = os.path.join(root, name)

            # Read the Markdown file
            with open(md_file_path, "r", encoding="utf-8") as md_file:
                md_content = md_file.read()

            # Split the content into sections for each question and answer
            sections = re.split(r"####\s+Q\d+\.", md_content)

            # Remove the first empty section
            sections.pop(0)

            # Create a list to store questions and answers for this MD file
            questions_and_answers = []

            # Iterate through sections and extract questions and answers
            for section in sections:
                # Split the section into lines
                lines = section.strip().split("\n")

                # Extract the question
                question = lines[0].strip()

                # Extract the answers
                answers = [line.strip() for line in lines[1:] if line.strip()]

                # Create a dictionary for this question and answers
                qa_dict = {"question": question, "answers": answers}

                # Append to the list of questions and answers
                questions_and_answers.append(qa_dict)

            # Create a dictionary for this MD file
            md_data = {
                "markdown_file": name,
                "questions_and_answers": questions_and_answers,
            }

            # Append to the list of data for each MD file
            data_for_each_md.append(md_data)

# Save JSON files in the output folder

for md_data in data_for_each_md:
    json_file_name = os.path.splitext(md_data["markdown_file"])[0] + ".json"
    json_file_path = os.path.join(output_folder, json_file_name)
    with open(json_file_path, "w", encoding="utf-8") as json_file:
        json.dump(md_data, json_file, indent=4)

print(f"JSON files created for each MD file in the '{output_folder}' folder.")w

Step 3: Indexing the Data

After processing the data, you can index it to make it searchable. Indexing refers to the process of creating an index for the data. The following code snippet demonstrates how to index the data using the Whoosh library. This is how the indexing pipeline will work.

Add the following code to a file named indexing_pipeline.py in the same directory where you cloned the repository.

import os
import json
import whoosh
from whoosh.fields import TEXT, ID, Schema
from whoosh.index import create_in

# Define the directory where your processed JSON files are located

json_files_directory = r"C:\Users\Harminder Nijjar\Desktop\blog\kb-blog-portfolio-mkdocs-master\scripts\linkedin-skill-assessments-quizzes\json_output"

# Define the directory where you want to create the Whoosh index

index_directory = r"C:\Users\Harminder Nijjar\Desktop\blog\kb-blog-portfolio-mkdocs-master\scripts\linkedin-skill-assessments-quizzes\index"

# Create the schema for the Whoosh index

schema = Schema(
    markdown_file=ID(stored=True),
    question=TEXT(stored=True),
    answers=TEXT(stored=True),
)

# Create the index directory if it doesn't exist

os.makedirs(index_directory, exist_ok=True)

# Create the Whoosh index

index = create_in(index_directory, schema)

# Open the index writer

writer = index.writer()

# Iterate through JSON files and add documents to the index

for json_file_name in os.listdir(json_files_directory):
    if json_file_name.endswith(".json"):
        json_file_path = os.path.join(json_files_directory, json_file_name)
        with open(json_file_path, "r", encoding="utf-8") as json_file:
            json_data = json.load(json_file)
        # Each JSON file from Step 2 holds a list of question/answer pairs
        for qa in json_data.get("questions_and_answers", []):
            question = qa.get("question", "")
            answers = qa.get("answers", [])
            # Join the answers into a single string, since TEXT fields expect text
            writer.add_document(
                markdown_file=json_file_name,
                question=question,
                answers=" ".join(answers),
            )

# Commit changes to the index

writer.commit()

print("Indexing completed.")

Step 4: Setting Up the Query Service

After indexing the data, you can set up a query service to search the index for a given search term. The following code snippet demonstrates how to set up a query service using the Whoosh library. This is how the query service will work.

import os
import json
import re
from whoosh.index import create_in, open_dir
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import MultifieldParser
from whoosh.analysis import StemmingAnalyzer

# Define the schema for the index

schema = Schema(
    question=TEXT(stored=True, analyzer=StemmingAnalyzer()),
    answer=TEXT(stored=True),
    image=ID(stored=True),
    options=TEXT(stored=True),
)

def create_search_index(json_files_directory, index_dir):
    if not os.path.exists(index_dir):
        os.makedirs(index_dir)

    index = create_in(index_dir, schema)
    writer = index.writer()

    for json_filename in os.listdir(json_files_directory):
        json_file_path = os.path.join(json_files_directory, json_filename)
        if json_file_path.endswith(".json"):
            try:
                with open(json_file_path, "r", encoding="utf-8") as file:
                    data = json.load(file)
                    for question_data in data["questions_and_answers"]:
                        question_text = question_data["question"]
                        answer_text = "\n".join(question_data["answers"])
                        image_id = data.get("image_id")
                        options = question_data.get("options", "")
                        writer.add_document(
                            question=question_text,
                            answer=answer_text,
                            image=image_id,
                            options=options,
                        )
            except Exception as e:
                print(f"Failed to process file {json_file_path}: {e}")

    writer.commit()
    print("Indexing completed successfully.")

def extract_correct_answer(answer_text):
    # Use regular expression to find the portion with "- [x]"
    match = re.search(r"- \[x\].*", answer_text)
    if match:
        return match.group()
    return None

def search_index(query_str, index_dir):
    try:
        ix = open_dir(index_dir)
        with ix.searcher() as searcher:
            parser = MultifieldParser(["question", "options"], schema=ix.schema)
            query = parser.parse(query_str)
            results = searcher.search(query, limit=None)
            print(f"Search for '{query_str}' returned {len(results)} results.")
            return [
                {
                    "question": result["question"],
                    "correct_answer": extract_correct_answer(result["answer"]),
                    "image": result.get("image"),
                }
                for result in results
            ]
    except Exception as e:
        print("An error occurred during the search.")
        return []

if __name__ == "__main__":
    json_files_directory = r"C:\Users\Harminder Nijjar\Desktop\blog\kb-blog-portfolio-mkdocs-master\scripts\linkedin-skill-assessments-quizzes\json_output" # Replace with your JSON files directory path
    index_dir = "index" # Replace with your index directory path

    create_search_index(json_files_directory, index_dir)

    original_string = "Why would you use a virtual environment?"  # Replace with your actual search term
    # Remove the special characters from the original string
    query_string = re.sub(r"[^a-zA-Z0-9\s]", "", original_string)
    query_string = query_string.lower()
    query_string = query_string.strip()
    search_results = search_index(query_string, index_dir)

    if search_results:
        for result in search_results:
            print(f"Question: {result['question']}")
            # Remove the "- [x]" portion from the answer
            print(f"Correct answer: {result['correct_answer'].replace('- [x] ', '')}")
            if result.get("image"):
                print(f"Image: {result['image']}")
            print("\n")
        print(f"Search for '{original_string}' completed successfully.")
        print(f"Found {len(search_results)} results.")
    else:
        print("No results found.")

Conclusion

Creating an efficient indexing pipeline for the 'linkedin-skill-assessments-quizzes' repository involves systematic cloning, data processing, indexing, and query service setup. This comprehensive guide has walked you through each step with detailed code snippets, leveraging the Whoosh library for indexing. You should now be able to query the index and get the results. The script will print the question, answer, and image (if available) for each result.

Since the data is indexed, you can easily search for a given term and get the results. This can be useful for finding the answers to specific questions or searching for a particular topic. You can also use the query service to create a web application that allows users to search the index and get the results.
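
As one possible starting point for such a web application, here is a minimal sketch that wraps the search_index function from Step 4 in a small Flask app. It assumes Flask is installed, the index has already been built, and that the Step 4 code lives in an importable module (the module name below is hypothetical).

from flask import Flask, jsonify, request

from query_service import search_index  # hypothetical module containing the Step 4 code

app = Flask(__name__)

@app.route("/search")
def search():
    query = request.args.get("q", "").strip()
    if not query:
        return jsonify({"error": "missing query parameter 'q'"}), 400
    # Reuse the existing query service against the previously built index.
    results = search_index(query, "index")
    return jsonify({"query": query, "results": results})

if __name__ == "__main__":
    app.run(debug=True)

With the app running, a request such as /search?q=virtual+environment would return the matching questions and answers as JSON.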

OpenAI's Developer Conference: A New Era of AI Innovation

OpenAI DevDay

GPT-4 Turbo with 128K context: Breaking Boundaries in Language Modeling

OpenAI announced GPT-4 Turbo at its November Developer Conference, a new language model that builds on the success of GPT-4. The model is designed to break boundaries in language modeling, offering a longer context window, more control, better knowledge, new modalities, customization, and higher rate limits. GPT-4 Turbo raises the number of tokens the model can handle in its context window from 8,000 to 128,000, a substantial enhancement in its ability to maintain context over longer conversations or documents. Compared to the standard GPT-4, this is a huge leap forward in the amount of information the model can process.

The new model also offers more control, specifically in terms of model inputs and outputs, and better knowledge, which includes updating the cut-off date for knowledge about the world to April 2023 and providing the ability for developers to easily add their own knowledge base. New modalities, such as DALL-E 3, Vision, and TTS (text-to-speech), will all be included in the API, with a new version of Whisper speech recognition coming. Customization, including fine-tuning and custom models (which, Altman warned, won't be cheap), and higher rate limits are also included in the new model, making it a comprehensive upgrade over its predecessors.
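
As a quick illustration of what calling the new model looks like, here is a sketch that assumes the openai Python SDK (v1.x) and the gpt-4-1106-preview model identifier announced at DevDay.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# GPT-4 Turbo is called through the standard Chat Completions endpoint and can
# accept far longer inputs thanks to the 128K-token context window.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the key announcements from OpenAI DevDay."},
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)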



Multimodal Capabilities: Expanding AI's Horizon

GPT-4 Turbo with vision

GPT-4 now integrates vision, allowing it to understand and analyze images, enhancing its capabilities beyond text. Developers can utilize this feature through the gpt-4-vision-preview model. It supports a range of applications, including caption generation and detailed image analysis, beneficial for services like BeMyEyes, which aids visually impaired individuals. The vision feature will soon be included in GPT-4's stable release. Costs vary by image size; for example, a 1080×1080 image analysis costs approximately $0.00765. For more details, OpenAI provides a comprehensive vision guide and DALL·E 3 remains the tool for image generation.

GPT-4 Turbo with vision analyzing Old School RuneScape through the RuneLite interface

import base64
import logging
import os
import time
from PIL import ImageGrab, Image
import pyautogui as gui
import pygetwindow as gw
import requests

# Set up logging to capture events when script runs and any possible errors.

log_filename = 'rune_capture.log' # Replace with your desired log file name
logging.basicConfig(
    filename=log_filename,
    filemode='a',
    level=logging.INFO,
    format='%(asctime)s - %(name)s - [%(levelname)s] [%(pathname)s:%(lineno)d] - %(message)s - [%(process)d:%(thread)d]'
)
logger = logging.getLogger(__name__)

# Set the client window title.

client_window_title = "RuneLite"

def capture_screenshot():
    try:
        # Get the title of the client window.
        win = gw.getWindowsWithTitle(client_window_title)[0]
        win.activate()
        time.sleep(1)

        # Get the client window's position.
        clientWindow = gw.getWindowsWithTitle(client_window_title)[0]
        x1, y1 = clientWindow.topleft
        x2, y2 = clientWindow.bottomright

        # Define the screenshot path and crop area.
        path = "gameWindow.png"
        gui.screenshot(path)
        img = Image.open(path)
        img = img.crop((x1 + 1, y1 + 40, x2 - 250, y2 - 165))
        img.save(path)
        return path

    except Exception as e:
        logger.error(f"An error occurred while capturing screenshot: {e}")
        raise

def encode_image(image_path):
    try:
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")
    except Exception as e:
        logger.error(f"An error occurred while encoding image: {e}")
        raise

def send_image_to_api(base64_image):
    api_key = os.getenv("OPENAI_API_KEY")
    headers = {"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"}

    payload = {
        "model": "gpt-4-vision-preview",
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": "What’s in this image?"}, {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}]},
        ],
        "max_tokens": 300,
    }

    try:
        response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
        response.raise_for_status()  # Will raise an exception for HTTP errors.
        return response.json()

    except Exception as e:
        logger.error(f"An error occurred while sending image to API: {e}")
        raise

if __name__ == "__main__":
    try:
        # Perform the main operations.
        screenshot_path = capture_screenshot()
        base64_image = encode_image(screenshot_path)
        api_response = send_image_to_api(base64_image)
        print(api_response)
    except Exception as e:
        logger.error(f"An error occurred in the main function: {e}")

DALL·E 3

Developers can now access DALL·E 3, a multimodal model that generates images from text directly through the API by specifying dall-e-3 as the model.
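
A minimal sketch, assuming the openai Python SDK (v1.x):

from openai import OpenAI

client = OpenAI()

# Request a single 1024x1024 image from DALL·E 3.
result = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor painting of a lighthouse at dawn",
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # URL of the generated image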

TTS (Text-to-Speech)

OpenAI's newest model is available to generate human-quality speech from text via their text-to-speech API.
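
For example, a short sketch assuming the openai SDK's speech endpoint, the tts-1 model, and the "alloy" voice:

from openai import OpenAI

client = OpenAI()

# Generate spoken audio from text and save it as an MP3 file.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Welcome to the OpenAI DevDay recap.",
)
speech.stream_to_file("devday_recap.mp3")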

Revenue-Sharing GPT Store: Empowering Creators

The DevDay also cast a spotlight on the newly announced revenue-sharing GPT Store. This platform represents a strategic move towards a more inclusive creator economy within AI, offering compensation to creators of AI applications based on user engagement and usage. This initiative is a nod to the growing importance of content creators in the AI ecosystem and reflects a broader trend of recognizing and rewarding the contributions of individual developers and innovators.

Microsoft Partnership and Azure's Role

The ongoing collaboration with Microsoft was highlighted, with a focus on how Azure's infrastructure is being optimized to support OpenAI's sophisticated AI models. This partnership is a testament to the shared goal of accelerating AI innovation and enhancing integration across various services and platforms as well as Microsoft's heavy investment in AI.

Safe and Gradual AI Integration

OpenAI emphasized a strategic approach to AI integration, advocating for a balance between innovation and safety. The organization invites developers to engage with the new tools thoughtfully, ensuring a responsible progression of AI within different sectors. This measured approach is a reflection of OpenAI's commitment to the safe and ethical development of AI technologies.

Conclusion

The Developer Conference marked a notable milestone for OpenAI and the broader AI community. The launch of GPT4-Turbo and the introduction of new multimodal capabilities, combined with the support of Microsoft's Azure and the innovative revenue-sharing model of the GPT Store, heralds a new phase of growth and experimentation in AI applications.

Available Voices from ElevenLabs API in November 2023

Voice Profiles

ElevenLabs API offers a diverse range of voices, perfect for various use cases like narration and video game character voices. The following voice profiles are available as of November 2023:
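
Each profile's Voice ID is what you pass to the text-to-speech endpoint. Below is a hedged sketch using the requests library; the v1 endpoint URL, the xi-api-key header, and the payload fields are assumptions to verify against the ElevenLabs documentation.

import os
import requests

VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # Rachel (see the profile below)
url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"  # assumed endpoint

headers = {
    "xi-api-key": os.getenv("ELEVENLABS_API_KEY"),  # assumed header name
    "Content-Type": "application/json",
}
payload = {"text": "Hello from Rachel, the articulate narrator."}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()

# The endpoint returns audio bytes (MP3 by default); save them to a file.
with open("rachel_sample.mp3", "wb") as f:
    f.write(response.content)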

Rachel: The Articulate Narrator

Rachel's voice carries a balance of clarity and tranquility, ideal for narration and audiobook projects.

  • Voice ID:
    21m00Tcm4TlvDq8ikWAM
    
  • Accent: American
  • Description: Calm
  • Age: Young
  • Gender: Female
  • Use Case: Narration
  • Preview Voice

Clyde: The Veteran Storyteller

Clyde offers a voice rich with experience, perfect for gritty narratives or character roles that demand a seasoned tone.

  • Voice ID:
    2EiwWnXFnvU5JabPnv8n
    
  • Accent: American
  • Description: War veteran
  • Age: Middle-aged
  • Gender: Male
  • Use Case: Video games
  • Preview Voice

Domi: The Confident Influencer

Domi's commanding voice is filled with confidence, suited for strong narrative leads or powerful corporate presentations.

  • Voice ID:
    AZnzlk1XvdvUeBnXmlld
    
  • Accent: American
  • Description: Strong
  • Age: Young
  • Gender: Female
  • Use Case: Narration
  • Preview Voice

Dave: The Engaging Entertainer

Dave's British-Essex accent adds a unique and engaging flavor, ideal for interactive content and characters with a touch of humor.

  • Voice ID:
    CYw3kZ02Hs0563khs1Fj
    
  • Accent: British-Essex
  • Description: Conversational
  • Age: Young
  • Gender: Male
  • Use Case: Video games
  • Preview Voice

Fin: The Rugged Sea Captain

Fin's voice, with its Irish accent and seasoned timbre, is perfectly suited for characters with depth and a storied past.

  • Voice ID:
    D38z5RcWu1voky8WS1ja
    
  • Accent: Irish
  • Description: Sailor
  • Age: Old
  • Gender: Male
  • Use Case: Video games
  • Preview Voice

OSRS Money Making Guide 2024 - How to Earn a Free RuneScape Bond

Help the OSRS community and earn a bond!

One Small Wiki Favour is a project to improve the Old School RuneScape Wiki, where wiki users can contribute to a number of ongoing wiki tasks and qualify for receiving in-game rewards, including bonds.

Anyone is welcome to participate, whether you're completely new to the wiki or you've been here for a while. If you have not edited before, then this is a great time to get involved!

To participate in this project, join the #oswf-osrs channel in OSRS Wiki's Discord to coordinate with other editors, discuss the project, and claim your rewards.

Rules

  • You must have a wiki account and cannot be blocked. Anonymous edits will not count. You can sign up for a wiki account at Special:CreateAccount.
  • The reward for completion of a task will be one bond or equivalent coins. If multiple editors contribute to a goal then the prize may be split between them.
  • Partial rewards may be awarded for completing part of a task, with a minimum award at 25% of a bond.
  • A task must be completed in its entirety before any bonds or partial bonds are awarded for that task.
  • If you are awarded a bond or a share of a bond, you can get in contact with one of the project runners on Discord, and claim it in-game.
  • You can claim a maximum of two bonds over each two week period that tasks are active.

Automatic notifications

To get automatic notifications of new tasks, you can join the Discord and subscribe to the #oswf-osrs channel.

Alternatively, you can build an IFTTT applet using the RSS feed of the recent changes to the wiki. You can then automatically generate a Notion page with the new tasks and have them sent to you via text message or email.
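
If you prefer to script the notification step yourself instead of using IFTTT, here is a small sketch along the same lines. It assumes the feedparser package and the wiki's recent-changes RSS URL, which you should confirm on the wiki before relying on it.

import feedparser

# Assumed recent-changes RSS feed for the OSRS Wiki; verify the exact URL.
FEED_URL = "https://oldschool.runescape.wiki/w/Special:RecentChanges?feed=rss"

feed = feedparser.parse(FEED_URL)

# Print recent edits whose titles mention the One Small Wiki Favour pages.
for entry in feed.entries:
    if "One Small Wiki Favour" in entry.title:
        print(entry.title, entry.link)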