Blog

January 23, 2025
in Robotics
3 min read

Setting Up and Running D500 LiDAR Kit's STL-19P on ROS 2 Jazzy

This guide walks you through setting up the D500 LiDAR Kit's STL-19P sensor for ROS 2 Jazzy, using the ldrobotSensorTeam/ldlidar_ros2 repository. By the end of this article, you'll be able to configure, launch, and visualize LIDAR data in ROS 2.

Prerequisites

Before proceeding, ensure you have the following set up:

ROS 2 Jazzy Installed: Follow the official instructions to install ROS 2 Jazzy.

Set Up Your ROS 2 Workspace: Create a workspace if you don't already have one:

mkdir -p ~/Desktop/frata_workspace/src
cd ~/Desktop/frata_workspace
colcon build
source install/setup.bash

Cloning and Building the LDLiDAR Package

Clone the Repository:

cd ~/Desktop/frata_workspace/src
git clone https://github.com/ldrobotSensorTeam/ldlidar_ros2.git

Install Dependencies: Use rosdep to install any missing dependencies:

cd ~/Desktop/frata_workspace
rosdep install --from-paths src --ignore-src -r -y

Build the Workspace: Compile the package:

colcon build --symlink-install --cmake-args=-DCMAKE_BUILD_TYPE=Release

Source the Workspace: Add the following to your ~/.bashrc and source it:

echo "source ~/Desktop/frata_workspace/install/local_setup.bash" >> ~/.bashrc
source ~/.bashrc

Running the LDLiDAR Node

Connect the LIDAR to a USB Port:
Ensure the LIDAR is connected to your machine. If the device isn't detected, try using a USB extension cable.
Identify the Serial Port: Check for the device's serial port:
```
ls /dev/ttyUSB*
```
Example output: /dev/ttyUSB0.
Launch the Node: Start the LDLiDAR node with the appropriate launch file:
```
ros2 launch ldlidar_ros2 ld19.launch.py
```
If required, modify the port_name in the ld19.launch.py file to match your detected port (e.g., /dev/ttyUSB0).
View LIDAR Data:
Open Rviz2 to visualize the LIDAR data:
```
rviz2
```
Add a "LaserScan" display and set the topic to /scan.

Troubleshooting Common Errors

1. "Communication Abnormal" Error

If you encounter this error:

[ERROR] [ldlidar_publisher_ld19]: ldlidar communication is abnormal.

Check Serial Port: Ensure the port your device was detected on (e.g., /dev/ttyUSB0) is the one specified in the launch file.
Verify Baud Rate: Confirm that the baud rate in the launch file matches the LIDAR's configuration (default is 230400).
Reconnect the Device: Use a USB extension cable if the device isn't recognized properly.
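
One way to check the port and baud rate before editing the launch file is to read raw bytes from the port and look for packet headers. This is a minimal sketch, assuming pyserial is installed and that the sensor uses the LD06/LD19-style framing in which each packet begins with the bytes 0x54 0x2C. The function names are illustrative and are not part of ldlidar_ros2.

```python
PACKET_HEADER = b"\x54\x2c"  # assumed framing: header byte 0x54 followed by 0x2C


def count_packet_headers(data):
    """Count how many times the assumed packet header appears in raw bytes."""
    return data.count(PACKET_HEADER)


def probe_port(port="/dev/ttyUSB0", baud=230400, num_bytes=2048):
    """Read raw bytes from the port and report (bytes_read, headers_seen)."""
    import serial  # pyserial; imported here so the helper above works without it

    with serial.Serial(port, baud, timeout=2.0) as conn:
        data = conn.read(num_bytes)
    return len(data), count_packet_headers(data)
```

Run probe_port() while the ROS node is stopped. Zero bytes points to the wrong port or a power problem, while plenty of bytes with almost no headers suggests a baud rate mismatch. The header bytes can also occur by chance inside a packet, so treat the count as a rough indicator only.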

2. Device Not Found

Run:
```
ls /dev/ttyUSB*
```
If no device appears, ensure the LIDAR is securely connected and powered.

3. No Data in Rviz2

Verify the /scan topic is being published:
```
ros2 topic list
ros2 topic echo /scan
```
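
If the topic is listed but the echo output is hard to read, a small subscriber that summarizes each scan can help. This is a sketch, assuming the driver publishes sensor_msgs/LaserScan on /scan; the node name scan_check is made up for the example.

```python
import math


def summarize_ranges(ranges, range_min, range_max):
    """Return (valid_count, nearest) for one scan; nearest is None if nothing is valid."""
    valid = [r for r in ranges
             if math.isfinite(r) and range_min <= r <= range_max]
    return len(valid), (min(valid) if valid else None)


def main():
    # ROS imports live here so summarize_ranges can be tried without ROS installed.
    import rclpy
    from rclpy.qos import qos_profile_sensor_data
    from sensor_msgs.msg import LaserScan

    rclpy.init()
    node = rclpy.create_node("scan_check")  # hypothetical node name

    def on_scan(msg):
        count, nearest = summarize_ranges(msg.ranges, msg.range_min, msg.range_max)
        node.get_logger().info(f"{count} valid returns, nearest: {nearest}")

    node.create_subscription(LaserScan, "/scan", on_scan, qos_profile_sensor_data)
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
    finally:
        node.destroy_node()
        rclpy.try_shutdown()
```

Call main() from a terminal where the workspace is sourced. If no log lines appear, the driver is not publishing. If the valid count stays at zero, the sensor is connected but returning no usable ranges.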

4. "Failed init_port fastrtps_port7000" Error

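This is a common Fast DDS shared memory transport error in ROS 2. Solution: switch the middleware to Cyclone DDS, which avoids the Fast DDS shared memory transport altogether. Install the ros-jazzy-rmw-cyclonedds-cpp package if it is not already present, then add the following to your .bashrc: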

export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp

Example Launch Output

Once everything is set up correctly, you should see output similar to the following:

[INFO] [ldlidar_publisher_ld19]: LDLiDAR SDK Pack Version is:3.3.1
[INFO] [ldlidar_publisher_ld19]: ROS2 param input:
[INFO] [ldlidar_publisher_ld19]: ldlidar serial connect is success
[INFO] [ldlidar_publisher_ld19]: ldlidar communication is normal.
[INFO] [ldlidar_publisher_ld19]: ldlidar driver start is success.
[INFO] [ldlidar_publisher_ld19]: start normal, pub lidar data

Conclusion

With this guide, you can successfully set up and run the D500 LiDAR Kit's STL-19P on ROS 2 Jazzy. If you encounter the "communication abnormal" or other errors, refer to the troubleshooting section to resolve them quickly. This setup enables seamless LIDAR integration for your autonomous robotics projects.

For more information, visit the ldrobotSensorTeam GitHub repository.

January 23, 2025
in Robotics
2 min read

Setting Up the `roboclaw_ros` Node with ROS Noetic in Docker

Introduction

In this guide, we’ll walk through setting up the roboclaw_ros node in ROS Noetic using Docker. This approach ensures a clean, consistent environment for development and deployment while leveraging Docker's flexibility. We'll cover everything from creating the Docker image to running the node with the correct configurations.

Prerequisites

Before proceeding, ensure you have:

Docker Installed: Docker must be installed and operational on your system.
Hardware Setup:
A Roboclaw motor controller connected to /dev/ttyACM0.
Encoders configured for your robot’s specific dimensions.
Dependencies Installed:
The roboclaw_driver library for ROS.
Python libraries like pyserial for communication.

Setup Steps

1. Pull the Base Docker Image

Start by pulling the base Docker image for ROS Noetic:

sudo docker pull arm64v8/ros:noetic-ros-base

2. Run a Container and Install ROS Noetic Components

Launch a container from the base image:

sudo docker run -it --name ros_noetic_container --rm arm64v8/ros:noetic-ros-base

Inside the container, update and install required packages:

apt update
apt install -y ros-noetic-rosbridge-server ros-noetic-tf python3-serial python3-pip
pip3 install diagnostic-updater

3. Create and Mount a Workspace

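To persist your workspace across sessions, exit the first container, create a workspace on your host machine, and mount it in a new container. Because the first container was started with --rm, it is deleted on exit, so repeat the apt and pip3 install commands from step 2 inside the new one: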

mkdir -p ~/ros_noetic_ws/src
sudo docker run -it --name ros_noetic_container --rm -v ~/ros_noetic_ws:/root/ros_noetic_ws arm64v8/ros:noetic-ros-base

Inside the container, initialize the workspace:

cd /root/ros_noetic_ws
catkin_make

4. Clone the `roboclaw_ros` Repository

Clone the repository into the workspace:

cd /root/ros_noetic_ws/src
git clone https://github.com/DoanNguyenTrong/roboclaw_ros.git

5. Build the Workspace

Return to the root of the workspace and build it:

cd /root/ros_noetic_ws
catkin_make
source devel/setup.bash

6. Save the Docker Image

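To save your container for future use, commit it as a new image. Run this from a second terminal on the host while the container is still running, because --rm deletes the container as soon as it exits:

sudo docker commit ros_noetic_container ros_noetic_saved

Exit the original container so its name is free, then run the saved image with automatic restart enabled. The --device flag passes the Roboclaw serial port into the container, and the -v flag mounts the workspace again:

sudo docker run -it --name ros_noetic_container --restart always --device=/dev/ttyACM0 -v ~/ros_noetic_ws:/root/ros_noetic_ws ros_noetic_saved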

Running the `roboclaw_ros` Node

To run the roboclaw_ros node, use the following steps:

1. Start the Container

Start the container with the saved image:

sudo docker start -ai ros_noetic_container

2. Launch the Node

Inside the container, source the workspace and run the launch file:

source /root/ros_noetic_ws/devel/setup.bash
roslaunch roboclaw_node roboclaw.launch

Testing the Node

To verify the node's functionality:

Publish Commands to /cmd_vel:

rostopic pub /cmd_vel geometry_msgs/Twist '{linear: {x: 0.5}, angular: {z: 0.1}}'

Monitor Output:

Check odometry data on /odom:

rostopic echo /odom
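
For a bounded test, it can be convenient to publish for a fixed time and then send an explicit stop. The following is a minimal sketch, assuming the node listens on /cmd_vel as described above. The node name and the speed limits are assumptions chosen for the example.

```python
def clamp(value, limit):
    """Limit value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def drive_briefly(linear_x=0.5, angular_z=0.1, seconds=2.0, rate_hz=10):
    """Publish a Twist on /cmd_vel for a fixed time, then publish a stop."""
    import rospy
    from geometry_msgs.msg import Twist

    rospy.init_node("cmd_vel_smoke_test")  # hypothetical node name
    pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
    cmd = Twist()
    cmd.linear.x = clamp(linear_x, 0.5)    # assumed safe limit in m/s
    cmd.angular.z = clamp(angular_z, 1.0)  # assumed safe limit in rad/s

    rate = rospy.Rate(rate_hz)
    end_time = rospy.Time.now() + rospy.Duration(seconds)
    while not rospy.is_shutdown() and rospy.Time.now() < end_time:
        pub.publish(cmd)
        rate.sleep()
    pub.publish(Twist())  # an all-zero Twist asks the robot to stop
```

Call drive_briefly() inside the container with the workspace sourced, and keep the wheels off the ground for the first run.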

Viewing Docker Folder Structure

To view the folder structure inside the container, use the following (install tree first with apt install -y tree if it is missing):

sudo docker exec -it ros_noetic_container bash
cd /root/ros_noetic_ws
tree

Conclusion

This guide provides a straightforward approach to setting up and running the roboclaw_ros node in a ROS Noetic Docker environment. Docker ensures consistency and portability, making it an ideal choice for robotics development. By following these steps, you can integrate Roboclaw into your robotic system efficiently.

January 22, 2025
in Robotics
2 min read

Setting Up and Running a Roboclaw-Based ROS Node with Obstacle Avoidance

This guide walks you through setting up and running a Roboclaw-based ROS node with obstacle avoidance functionality using LIDAR sensors. Follow these steps to configure and launch your robotics system efficiently.

Prerequisites

Before starting, ensure the following are set up on your system:

ROS Installed: Ensure you have ROS Melodic or a compatible version installed on your system.
Workspace Prepared: Your ROS workspace (e.g., ~/armpi_pro) is built and sourced.
Packages Installed:
roboclaw_ros for motor control.
ldlidar_stl_ros for LIDAR sensor integration.
Hardware Connections:
Roboclaw is connected via /dev/ttyACM0.
LIDAR sensor is operational and connected (e.g., /dev/ttyUSB0).

Launching the Required Nodes

To operate the system, you need to launch three components in sequence:

1. Start the ROS Core

Open a terminal and launch the ROS core:

roscore

Keep this terminal open as it provides the foundation for all ROS nodes.

2. Launch the Roboclaw Node

In a new terminal, navigate to your workspace and launch the Roboclaw node:

roslaunch roboclaw_node roboclaw.launch

This node handles motor control, publishing odometry, and subscribing to velocity commands (/cmd_vel).

3. Launch the LIDAR Node

In another terminal, launch the LIDAR node:

roslaunch ldlidar_stl_ros ld19.launch

This node processes LIDAR data and publishes it to the /scan topic.

Running the Obstacle Avoidance Node

Once the Roboclaw and LIDAR nodes are running, you can start the obstacle avoidance script. This script subscribes to /scan for LIDAR data and publishes velocity commands to /cmd_vel.

Ensure the script is located at:

~/armpi_pro/src/roboclaw_ros/roboclaw_node/scripts/obstacle_avoidance.py

Make the script executable:

chmod +x ~/armpi_pro/src/roboclaw_ros/roboclaw_node/scripts/obstacle_avoidance.py

Run the script using rosrun:

rosrun roboclaw_node obstacle_avoidance.py

How It Works

LIDAR Data Processing: The obstacle avoidance node processes data from /scan. It checks for obstacles within 6 inches (0.15 meters) in front of the robot.
Motor Commands:
If an obstacle is detected, the script sends a stop command to /cmd_vel.
If the path is clear, the script commands the robot to move forward.
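
The stop-or-go logic described above can be sketched as follows. This is not the obstacle_avoidance.py script itself, which is not shown here. It assumes that angle 0 in the scan points straight ahead, and the 15 degree half-width of the forward sector and the 0.2 m/s forward speed are invented for the example.

```python
import math


def obstacle_ahead(ranges, angle_min, angle_increment,
                   threshold=0.15, half_width=math.radians(15)):
    """Return True if any valid return in the forward sector is closer than threshold."""
    for i, r in enumerate(ranges):
        if math.isinf(r) or math.isnan(r) or r <= 0.0:
            continue  # skip missing or invalid returns
        angle = angle_min + i * angle_increment
        # Wrap to [-pi, pi] so scans reported as 0..2*pi are handled too.
        angle = math.atan2(math.sin(angle), math.cos(angle))
        if abs(angle) <= half_width and r < threshold:
            return True
    return False


def main():
    import rospy
    from geometry_msgs.msg import Twist
    from sensor_msgs.msg import LaserScan

    rospy.init_node("obstacle_avoidance_sketch")  # hypothetical node name
    pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)

    def on_scan(scan):
        cmd = Twist()  # zero velocity means stop
        if not obstacle_ahead(scan.ranges, scan.angle_min, scan.angle_increment):
            cmd.linear.x = 0.2  # assumed forward speed in m/s
        pub.publish(cmd)

    rospy.Subscriber("/scan", LaserScan, on_scan, queue_size=1)
    rospy.spin()
```

Keeping the distance check in a plain function makes it possible to test the threshold and sector logic with hand-written range lists before running it on the robot.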

Example Workflow

Here’s how you would set up and run the entire system step-by-step:

Open a terminal and start the ROS core:
```
roscore
```
In a second terminal, launch the Roboclaw node:
```
roslaunch roboclaw_node roboclaw.launch
```
In a third terminal, launch the LIDAR node:
```
roslaunch ldlidar_stl_ros ld19.launch
```
Finally, in a fourth terminal, run the obstacle avoidance script:
```
rosrun roboclaw_node obstacle_avoidance.py
```

Troubleshooting

Package Not Found: If you encounter errors like package 'roboclaw_node' not found, rebuild your workspace:
```
cd ~/armpi_pro
catkin_make
source devel/setup.bash
```
LIDAR or Roboclaw Not Responding:
Verify device connections using ls /dev/tty* for correct port names.
Update the respective .launch files to reflect the correct ports.

Script Permissions: Ensure all scripts are executable using:

chmod +x ~/armpi_pro/src/roboclaw_ros/roboclaw_node/scripts/*.py

Conclusion

With this setup, your robot is capable of autonomously navigating forward and stopping when obstacles are detected. The modular structure allows for easy debugging and future enhancements, such as adding new sensors or navigation strategies.

Mastering these steps ensures a reliable and robust robotic system ready for real-world applications.

January 22, 2025
in Robotics
5 min read

Setting Up Hector SLAM with ROS 1 for a Roboclaw-Based Rover

This guide details the steps to configure and use Hector SLAM with a Roboclaw-based ROS 1 robot. We’ll cover LIDAR integration, transform configuration, and how to set RViz for successful mapping and navigation.

Prerequisites

Ensure the following are set up on your system:

ROS Installed: ROS Melodic or a compatible version is installed.
Workspace Prepared: A catkin workspace (e.g., ~/armpi_pro) is created, built, and sourced.
Hector SLAM Package:
Installed and located in your workspace under src/hector_slam.
Ensure hector_mapping and hector_slam_launch are available.
LIDAR Sensor:
Configured and publishing scan data to /scan.
Roboclaw Node:
Operational, publishing odometry to /odom.

Steps to Set Up Hector SLAM

1. Launch ROS Core

Open a terminal and start the ROS core:

roscore

This is required for all other nodes to communicate.

2. Launch the LIDAR Node

To enable LIDAR data for mapping, launch the LIDAR node in a new terminal:

roslaunch ldlidar_stl_ros ld19.launch

This ensures the LIDAR sensor is active and publishing data to the /scan topic, which is necessary for Hector SLAM to create a map.

3. Configure and Launch Hector Mapping

In your workspace, navigate to the Hector SLAM launch folder:

cd ~/armpi_pro/src/hector_slam/hector_slam_launch/launch

Using the tutorial.launch file as a base, create a new launch file named mapping_with_odom.launch in this folder with the following content, adjusting the parameters to match your setup:

<launch>
    <!-- Hector Mapping Node -->
    <node pkg="hector_mapping" type="hector_mapping" name="hector_mapping" output="screen">
        <param name="pub_map_odom_transform" value="true"/>
        <param name="map_frame" value="map"/>
        <param name="base_frame" value="base_link"/>
        <param name="odom_frame" value="odom"/>
        <remap from="scan" to="/scan"/>  <!-- Your LIDAR topic -->
        <remap from="odom" to="/odom"/>  <!-- Your odometry topic -->
    </node>

    <!-- Static Transform Publisher -->
    <node pkg="tf2_ros" type="static_transform_publisher" name="laser_to_base_link"
          args="0 0 0.2 0 0 0 base_link laser_link"/>

    <!-- RViz for Visualization -->
    <node pkg="rviz" type="rviz" name="rviz" args="-d $(find hector_slam_launch)/rviz_cfg/mapping.rviz" />
</launch>

Run the newly created launch file above using:

roslaunch hector_slam_launch mapping_with_odom.launch

4. Adjust RViz Configuration

To visualize the mapping process:

Start RViz:
```
rosrun rviz rviz
```
Set the Fixed Frame to scanmatcher_frame:
Navigate to Global Options in RViz.
Set Fixed Frame to scanmatcher_frame.
Add relevant displays:
Map: Subscribe to /map for live updates.
LaserScan: Subscribe to /scan to visualize LIDAR data.
TF: Check the frame hierarchy and transformations.

5. Launch the Roboclaw Node

Start the Roboclaw node to handle motor control and odometry:

roslaunch roboclaw_node roboclaw.launch

This node should publish odometry data to the /odom topic, which Hector SLAM uses.

6. Run the rover

Now that the mapping and motor control nodes are active:

Drive the rover manually or programmatically using the following script:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Example: Use Pygame to read an Xbox controller and RoboClaw on ROS Melodic (Python 2).
No f-strings; uses .format().

Make sure:
  1) roscore is running (for rospy logs).
  2) SDL_VIDEODRIVER=dummy is set if headless: export SDL_VIDEODRIVER=dummy
  3) RoboClaw is on /dev/ttyACM0 (or change param).
"""

import sys
import rospy
import pygame
import time

# For TF in Melodic (Python 2)
import tf
import tf.transformations as tft

# RoboClaw library (python2-compatible)
from roboclaw import Roboclaw

class XboxRoboclawDebugPy2(object):
    def __init__(self):
        # 1) ROS init
        rospy.init_node("xbox_teleop_debug_py2", anonymous=True)
        rospy.loginfo("Starting xbox_teleop_debug_py2 node...")

        # 2) RoboClaw Setup
        self.port_name = rospy.get_param("~port", "/dev/ttyACM0")
        self.baud_rate = rospy.get_param("~baud", 115200)
        self.address   = rospy.get_param("~address", 0x80)  # decimal 128

        rospy.loginfo("Attempting to open RoboClaw on {0} at {1} baud, address=0x{2:X}".format(
            self.port_name, self.baud_rate, self.address))
        self.rc = Roboclaw(self.port_name, self.baud_rate)
        if not self.rc.Open():
            rospy.logerr("Failed to open RoboClaw on {0}. Exiting.".format(self.port_name))
            sys.exit(1)
        rospy.loginfo("RoboClaw opened successfully on {0}.".format(self.port_name))

        # 3) Initialize Pygame
        pygame.init()
        pygame.joystick.init()

        joystick_count = pygame.joystick.get_count()
        rospy.loginfo("Detected {0} joystick(s).".format(joystick_count))
        if joystick_count == 0:
            rospy.logerr("No joystick detected. Exiting.")
            sys.exit(1)

        self.joystick = pygame.joystick.Joystick(0)
        self.joystick.init()
        rospy.loginfo("Joystick: {0}".format(self.joystick.get_name()))

        # 4) Speed config
        self.max_speed_cmd = 15.0  # if 127 is full speed, 15 is much slower
        self.deadzone = 0.1
        rospy.loginfo("max_speed_cmd={0}, deadzone={1}".format(self.max_speed_cmd, self.deadzone))

        # Rate
        self.loop_rate = rospy.Rate(20)  # 20 Hz

        # On shutdown => stop motors
        rospy.on_shutdown(self.stop_motors)

        rospy.loginfo("Initialization complete. Ready to run...")

    def stop_motors(self):
        rospy.loginfo("Stopping motors.")
        self.rc.ForwardM1(self.address, 0)
        self.rc.ForwardM2(self.address, 0)

    def run(self):
        loop_counter = 0
        while not rospy.is_shutdown():
            loop_counter += 1
            # Just for debugging
            rospy.logdebug("Main loop iteration={0}".format(loop_counter))

            pygame.event.pump()

            # Example: left stick vertical = axis 1, right stick horizontal = axis 3
            forward_axis = -self.joystick.get_axis(1)
            turn_axis    =  self.joystick.get_axis(3)

            # Debug prints
            rospy.logdebug("Raw axes => forward_axis={0:.2f}, turn_axis={1:.2f}".format(
                forward_axis, turn_axis))

            # Deadzone
            if abs(forward_axis) < self.deadzone:
                forward_axis = 0.0
            if abs(turn_axis) < self.deadzone:
                turn_axis = 0.0

            # Skid-steer mixing
            left_val  = forward_axis - turn_axis
            right_val = forward_axis + turn_axis

            # clamp [-1..+1]
            left_val  = max(-1.0, min(1.0, left_val))
            right_val = max(-1.0, min(1.0, right_val))

            # Convert to RoboClaw speed
            left_speed, left_dir   = self.convert_speed(left_val)
            right_speed, right_dir = self.convert_speed(right_val)

            rospy.logdebug("Mixed => L={0:.2f}, R={1:.2f} => speeds {2} ({3}), {4} ({5})".format(
                left_val, right_val, left_speed, left_dir, right_speed, right_dir))

            # Send to motors
            if left_dir == "forward":
                self.rc.ForwardM1(self.address, left_speed)
            else:
                self.rc.BackwardM1(self.address, left_speed)

            if right_dir == "forward":
                self.rc.ForwardM2(self.address, right_speed)
            else:
                self.rc.BackwardM2(self.address, right_speed)

            self.loop_rate.sleep()

        self.stop_motors()
        pygame.quit()

    def convert_speed(self, val):
        """
        Convert [-1..+1] => (0..max_speed_cmd, direction).
        """
        if val >= 0:
            return (int(val * self.max_speed_cmd), "forward")
        else:
            return (int(abs(val) * self.max_speed_cmd), "backward")

def main():
    try:
        node = XboxRoboclawDebugPy2()
        node.run()
    except rospy.ROSInterruptException:
        pass
    finally:
        rospy.loginfo("Exiting python2 xbox teleop script.")

if __name__ == "__main__":
    main()

Watch the map being built in RViz.
Steer the rover around obstacles and confirm that the map updates in real time.
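
To make the skid-steer mixing in the script concrete, here is the same arithmetic pulled out into standalone functions with a worked input. The function name mix_skid_steer is introduced only for this illustration. The deadzone and max_speed_cmd defaults mirror the values used in the script.

```python
def mix_skid_steer(forward_axis, turn_axis, deadzone=0.1):
    """Apply the deadzone, mix the two axes, and clamp each side to [-1, 1]."""
    if abs(forward_axis) < deadzone:
        forward_axis = 0.0
    if abs(turn_axis) < deadzone:
        turn_axis = 0.0
    left = max(-1.0, min(1.0, forward_axis - turn_axis))
    right = max(-1.0, min(1.0, forward_axis + turn_axis))
    return left, right


def convert_speed(val, max_speed_cmd=15.0):
    """Convert [-1, 1] to (integer speed command, direction)."""
    direction = "forward" if val >= 0 else "backward"
    return int(abs(val) * max_speed_cmd), direction


# Full forward on the left stick with a half deflection on the right stick
left, right = mix_skid_steer(1.0, 0.5)
print((left, right))          # (0.5, 1.0): the right side is clamped from 1.5
print(convert_speed(left))    # (7, 'forward')
print(convert_speed(right))   # (15, 'forward')
print(convert_speed(-0.5))    # (7, 'backward')
```

Because int() truncates, 0.5 * 15.0 = 7.5 becomes 7, so small stick deflections map to slightly lower commands than rounding would give.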

Key Adjustments Made

Here’s a summary of the adjustments we made to configure Hector SLAM:

Set Fixed Frame in RViz to scanmatcher_frame.
Modified the Hector SLAM launch file to include:
pub_map_odom_transform set to true.
Correct frame names (map, odom, base_link).
Verified LIDAR publishing to /scan using ld19.launch.
Ensured odometry data was published to /odom by the Roboclaw node.

Troubleshooting

Error: Transform Lookup Failed

If you see errors like "Lookup would require extrapolation into the future," check the following:

Ensure the system clocks are synchronized using ntp or chrony.
Verify all topics (/scan, /odom) have valid and synchronized timestamps.

LIDAR Not Publishing

Run:

rostopic echo /scan

If no data appears, ensure the LIDAR launch file is correctly configured and the sensor is operational.

Conclusion

By following these steps, you’ve successfully set up Hector SLAM on a Roboclaw-based ROS 1 rover. With LIDAR and odometry data, your rover can build a 2D map of its environment, enabling autonomous navigation and enhanced spatial awareness.

Feel free to experiment with SLAM parameters and refine your rover’s capabilities for your specific use case.

January 15, 2025
in Programming
2 min read

Bypassing Cloudflare with Selenium and Undetected-Chromedriver

By combining Selenium with undetected-chromedriver (UC), you can overcome common automation challenges like Cloudflare's browser verification. This guide explores practical workflows and techniques to enhance your web automation projects.

Why Use Selenium with Undetected-Chromedriver?

Cloudflare protections are designed to block bots, posing challenges for developers. By using undetected-chromedriver with Selenium, you can:

Bypass Browser Fingerprinting: UC modifies ChromeDriver to avoid detection.
Handle Cloudflare Challenges: Seamlessly bypass "wait while your browser is verified" messages.
Mitigate CAPTCHA Issues: Reduce interruptions caused by automated bot checks.

Detection Challenges in Web Automation

Websites employ multiple strategies to detect and prevent automated interactions:

CAPTCHA Challenges: Validating user authenticity.
Cloudflare Browser Verification: Infinite loading screens or token-based checks.
Bot Detection Mechanisms: Browser fingerprinting, behavioral analytics, and cookie validation.

These barriers often require advanced techniques to maintain automation workflows.

The Solution: Selenium and Undetected-Chromedriver

The undetected-chromedriver library modifies the default ChromeDriver to emulate human-like behavior and evade detection. When integrated with Selenium, it allows:

Seamless CAPTCHA Bypass: Minimize interruptions by automating responses or avoiding challenges.
Cloudflare Token Handling: Automatically manage verification processes.
Cookie Reuse for Session Preservation: Skip repetitive verifications by reusing authenticated cookies.

Implementation Guide: Setting Up Selenium with Undetected-Chromedriver

Step 1: Install Required Libraries

Install Selenium and undetected-chromedriver:

pip install selenium undetected-chromedriver

Step 2: Initialize the Browser Driver

Set up a Selenium session with UC:

import undetected_chromedriver as uc

# Initialize the driver
driver = uc.Chrome()

# Navigate to a website
driver.get("https://example.com")
print("Page Title:", driver.title)

# Quit the driver
driver.quit()

Step 3: Handle CAPTCHA and Cloudflare Challenges

Use UC to bypass passive bot checks.

Extract and reuse cookies to maintain session continuity:

cookies = driver.get_cookies()
# add_cookie() takes one cookie dict at a time, so loop over the list
for cookie in cookies:
    driver.add_cookie(cookie)

Advanced Automation Workflow with Cookies

Step 1: Attempt Standard Automation

Use Selenium with UC to navigate and interact with the website.

Step 2: Use Cookies for Session Continuity

Manually authenticate once, extract cookies, and reuse them for automated sessions:

# Save cookies after manual login
cookies = driver.get_cookies()

# Use cookies in future sessions
for cookie in cookies:
    driver.add_cookie(cookie)
driver.refresh()
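
To reuse cookies in a later session, they need to be written to disk between runs. Here is a minimal sketch using a JSON file. The helper names and the file path are illustrative, and driver stands for any Selenium WebDriver instance such as the uc.Chrome() object created earlier.

```python
import json


def save_cookies(cookies, path):
    """Write the list of cookie dicts returned by driver.get_cookies() to a JSON file."""
    with open(path, "w") as handle:
        json.dump(cookies, handle)


def load_cookies(path):
    """Read the cookie dicts back from the JSON file."""
    with open(path) as handle:
        return json.load(handle)


def apply_cookies(driver, cookies, url):
    """Open the cookies' site first, then add them one at a time and reload."""
    driver.get(url)  # Selenium only accepts cookies for the domain currently loaded
    for cookie in cookies:
        driver.add_cookie(cookie)
    driver.refresh()
```

A typical flow would be save_cookies(driver.get_cookies(), "cookies.json") after the manual login, then apply_cookies(driver, load_cookies("cookies.json"), "https://example.com") in the next session. Expired cookies will not restore the session, so the manual step may need repeating from time to time.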

Step 3: Fall Back to Manual Assistance

Prompt users to resolve CAPTCHA or login challenges in a separate session and capture the cookies for automation.

Proposed Workflow for Automation

Initial Attempt: Start with Selenium and UC for automation.
Fallback to Cookies: Reuse cookies for continuity if CAPTCHA or Cloudflare challenges arise.
Manual Assistance: Open a browser session for user input, capture cookies, and resume automation.

This iterative process ensures maximum efficiency and minimizes disruptions.

Conclusion

Selenium and undetected-chromedriver provide a powerful toolkit for overcoming automation barriers like CAPTCHA and Cloudflare protections. By leveraging cookies and manual fallbacks, you can create robust workflows that streamline automation processes.

Ready to enhance your web automation? Start integrating Selenium with UC today and unlock new possibilities!


January 15, 2025
in Programming
2 min read

Setting Up Browser Use with Botright

Automating web browsing while remaining undetected is critical for various use cases like scraping, testing, or automation. This guide shows you how to set up Browser Use with Botright, a robust library for stealth browsing and captcha solving.

Why Use Botright?

Botright enhances stealth browsing by leveraging techniques like fingerprint masking and AI-driven captcha solving. It integrates seamlessly with Playwright, making it perfect for stealthy automation tasks.

Prerequisites

Ensure the following tools and dependencies are installed:

Python 3.11 or Higher: Install from python.org.
pip: Included with Python.
Browser Use: The main framework.
Botright: For stealth features and advanced automation.

Install Required Packages

Run the following commands to set up your environment:

# Clone the repository
git clone https://github.com/browser-use/browser-use
cd browser-use

# Install uv for environment management
pip install uv

# Create a virtual environment
uv venv --python 3.11

# Activate the virtual environment
source .venv/bin/activate # For Linux/macOS
.venv\Scripts\activate    # For Windows

# Install Browser Use with development dependencies
uv pip install ".[dev]"

# Install Botright
uv pip install botright
playwright install

Implementation

Follow these steps to integrate Botright with Browser Use:

1. Update `main.py`

Replace the Playwright browser initialization with Botright’s initialization. Here’s the updated main.py:

import asyncio
import botright

async def main():
    # Initialize Botright with enhanced stealth options
    botright_client = await botright.Botright(headless=False)

    # Launch a browser instance
    browser = await botright_client.new_browser()

    # Open a new page
    page = await browser.new_page()

    # Navigate to a website
    await page.goto("https://example.com")

    # Perform browser actions (e.g., take a screenshot)
    await page.screenshot(path="screenshot.png")
    print("Screenshot saved!")

    # Close the browser and Botright client
    await browser.close()
    await botright_client.close()

if __name__ == "__main__":
    asyncio.run(main())

2. Configure Environment Variables

Set up required API keys by copying the example environment file:

cp .env.example .env

Add the following to your .env file:

BOTRIGHT_API_KEY=your_botright_api_key

3. Run the Script

Run your updated main.py script:

python main.py

4. What Happens?

Browser Initialization: Botright launches a Chromium-based browser instance with stealth enhancements.
Website Navigation: The browser navigates to the specified URL (https://example.com).
Screenshot Capture: A screenshot is taken and saved as screenshot.png.

Debugging Tips

Botright Logs: Use print statements or Botright's built-in logging for debugging.
Browser Errors: Verify the installed version of Playwright and Botright.
Environment Issues: Ensure your .env file contains the correct API keys.

Advanced Features

Captcha Solving

To solve captchas, use Botright's built-in methods. Example:

await page.solve_captcha()

Fingerprint Masking

Botright automatically handles fingerprint masking. To customize this, refer to the Botright documentation.

Conclusion

By integrating Botright with Browser Use, you unlock advanced stealth browsing capabilities ideal for sensitive automation tasks. This setup provides a solid foundation for further enhancements, such as integrating AI-based analysis or multi-page navigation workflows.

January 11, 2025
in Cloud Computing
2 min read

AWS Lambda and Blender: Revolutionizing 3D Rendering in the Cloud

One idea that has been on my back burner for several years is using AWS Lambda to render a three-dimensional STL or other Blender-compatible file for GitHub contributions. Since I first had the idea, I've significantly refined my understanding of 3D printing and Python scripting, which has allowed me to develop a more robust and scalable solution.

The Concept

The core concept revolves around leveraging AWS Lambda for rendering 3D scenes—a solution tailored for projects requiring high scalability and rapid turnaround times. This technique excels in scenarios involving numerous simpler assets that must be rendered swiftly, effectively harnessing the computational prowess of cloud technology.

The Implementation

The integration of Blender, a popular open-source 3D graphics software, running on AWS Lambda, epitomizes this blend of flexibility and computational efficiency. This approach is ideal for assets that fit within Lambda's constraints, currently supporting up to 6 vCPUs and 10GB of memory. For more demanding rendering needs, options like EC2 instances or AWS Thinkbox Deadline provide enhanced computational capacity, making them suitable for complex tasks.

The Workflow

The workflow for this implementation is straightforward:

Upload the Blender file to an S3 bucket: Begin by uploading the Blender file to an S3 bucket, ensuring it is accessible to the Lambda function.
Invoke the Lambda function: Trigger the Lambda function to render the 3D scene using Blender.
Retrieve the rendered image: Once the rendering is complete, retrieve the rendered image from the S3 bucket.
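
From the client side, the three steps above might look like the following sketch, assuming boto3 is installed and AWS credentials are configured. The function name blender-render, the S3 key paths, and the event fields are all hypothetical: they stand for whatever contract your own Lambda function expects.

```python
import json


def build_render_event(bucket, blend_key, output_key, frame=1):
    """Build the JSON payload; the field names are a hypothetical contract."""
    return {"bucket": bucket, "blend_key": blend_key,
            "output_key": output_key, "frame": frame}


def render_on_lambda(blend_path, bucket, function_name="blender-render"):
    """Upload a .blend file, invoke the render function, and download the image."""
    import boto3  # imported here so build_render_event works without boto3

    s3 = boto3.client("s3")
    lambda_client = boto3.client("lambda")

    blend_key = "scenes/scene.blend"   # hypothetical key layout
    output_key = "renders/scene.png"
    s3.upload_file(blend_path, bucket, blend_key)

    event = build_render_event(bucket, blend_key, output_key)
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the render to finish
        Payload=json.dumps(event).encode("utf-8"),
    )
    if response.get("FunctionError"):
        raise RuntimeError(response["Payload"].read().decode("utf-8"))

    s3.download_file(bucket, output_key, "render.png")
    return "render.png"
```

A synchronous invocation keeps the example short. For renders that run longer than the client's read timeout, an asynchronous invocation (InvocationType="Event") followed by polling the bucket is likely a better fit.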

The Benefits

The benefits of this approach are manifold:

Scalability: AWS Lambda's scalability ensures that rendering tasks can be efficiently distributed across multiple instances, enhancing performance.
Cost-Effectiveness: Pay only for the compute time consumed, making it a cost-effective solution for rendering tasks.
Flexibility: The ability to scale up or down based on project requirements offers unparalleled flexibility.
Efficiency: The seamless integration of Blender with AWS Lambda streamlines the rendering process, enhancing efficiency.

Credits

The inspiration for this approach was drawn from a detailed implementation by Theodo in 2021, showcasing how Blender can be effectively adapted for serverless architecture. This concept offers transformative potential in the 3D rendering landscape, demonstrating how cloud technologies can redefine efficiency and scalability in creative workflows.

Conclusion

The fusion of AWS Lambda and Blender represents a paradigm shift in 3D rendering, offering a potent solution for projects requiring rapid, scalable rendering capabilities. By leveraging the computational prowess of AWS Lambda and the versatility of Blender, developers can unlock new possibilities in the 3D rendering domain, revolutionizing creative workflows and enhancing efficiency.

January 7, 2025
in Machine Learning
2 min read

Fine-Tuning GPT-4o-mini: A Comprehensive Guide

Fine-tuning GPT-4o-mini allows you to create a customized AI model tailored to specific needs, such as generating content or answering domain-specific questions. This guide will walk you through preparing your data and executing the fine-tuning process.

Step 1: Prepare Your Dataset

Dataset Format

Fine-tuning requires a .jsonl dataset where each line is a structured chat interaction. For example:

{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of France?"}, {"role": "assistant", "content": "The capital of France is Paris."}]}
{"messages": [{"role": "system", "content": "You are a travel expert."}, {"role": "user", "content": "What are the best places to visit in Europe?"}, {"role": "assistant", "content": "Some of the best places to visit in Europe include Paris, Rome, Barcelona, and Amsterdam."}]}

Automate Dataset Preparation

Use the Text to JSONL Converter available at Streamlit to convert .txt files into .jsonl format. Ensure you have at least 10 samples.
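
If you prefer to do the conversion locally, a few lines of Python are enough. This sketch is not the logic of the converter mentioned above. It assumes each training sample is a passage separated by a blank line in the .txt file, and the system and user prompts are placeholders to replace with your own.

```python
import json


def text_to_jsonl_lines(text, system_prompt="You are a helpful assistant.",
                        user_prompt="Tell me a story."):
    """Turn blank-line-separated passages into chat-format JSONL lines."""
    lines = []
    for passage in text.split("\n\n"):
        passage = passage.strip()
        if not passage:
            continue  # ignore extra blank lines
        record = {"messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
            {"role": "assistant", "content": passage},
        ]}
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines


def convert_file(txt_path, jsonl_path):
    """Convert a text file to JSONL and return the number of samples written."""
    with open(txt_path, encoding="utf-8") as src:
        lines = text_to_jsonl_lines(src.read())
    with open(jsonl_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(lines) + "\n")
    return len(lines)
```

For example, convert_file("stories.txt", "stories.jsonl") returns the sample count, which makes it easy to confirm you have reached the 10-sample minimum.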

Step 2: Fine-Tune GPT-4o-mini

Required Code for Fine-Tuning

Save your stories.jsonl file and run the following Python script to initiate fine-tuning:

import os

from openai import OpenAI

# Initialize the OpenAI client with the key stored in the OPENAI_API_KEY environment variable
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Step 1: Upload the training file (the with-block closes the file after the upload)
with open("stories.jsonl", "rb") as training_file:  # Replace with the correct path to your JSONL file
    response = client.files.create(
        file=training_file,
        purpose="fine-tune"
    )

# Extract the file ID from the response object
training_file_id = response.id
print(f"File uploaded successfully. File ID: {training_file_id}")

# Step 2: Create a fine-tuning job
fine_tune_response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    model="gpt-4o-mini-2024-07-18"  # Replace with the desired base model
)

# Output the fine-tuning job details
print("Fine-tuning job created successfully:")
print(fine_tune_response)

Explanation of the Code

Initialize OpenAI Client:
The script imports the openai library and initializes the API using your key stored in the OPENAI_API_KEY environment variable.
Upload Training File:
The script uploads your stories.jsonl file to OpenAI's servers for processing.
Create Fine-Tuning Job:
The uploaded file is referenced to create a fine-tuning job for the gpt-4o-mini-2024-07-18 model. Replace this with the desired base model as needed.
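Print Job Details:
The script prints the newly created fine-tuning job, including its ID, initial status, and other metadata. The job itself runs asynchronously on OpenAI's servers, so the status shown is only the starting state.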
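Because the job keeps running after the script exits, a small polling loop can track it to completion. This is a sketch under stated assumptions: `wait_for_job`, `poll_seconds`, and the injectable `sleep` parameter are illustrative names, and the loop relies on `client.fine_tuning.jobs.retrieve` returning an object with a `status` field.

```python
import time

# Statuses after which the job will not change any further
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=30, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a terminal status, then return it."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print(f"Job {job_id} status: {job.status}")
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)  # wait before asking again


# Usage with the script above (needs a real API key, so it is left commented out):
# job = wait_for_job(client, fine_tune_response.id)
# print(job.fine_tuned_model)  # model name to use once the job has succeeded
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.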

Best Practices for Fine-Tuning

Quality Dataset: Ensure the dataset is diverse and adheres to the required structure.
System Role Definition: Use clear instructions in the system role to guide the model’s behavior.
Testing and Iteration: Evaluate the fine-tuned model and refine the dataset if necessary.
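One way to support the testing step is to hold back part of the dataset for validation. The sketch below is illustrative: `split_dataset`, the 20% fraction, and the seed are assumptions. The held-out lines can be written to a second .jsonl file, uploaded like the training file, and passed as `validation_file` when creating the job. Keep in mind that the training portion still needs at least 10 samples.

```python
import random


def split_dataset(lines, validation_fraction=0.2, seed=42):
    """Shuffle JSONL lines reproducibly and split them into (train, validation)."""
    records = [line for line in lines if line.strip()]
    random.Random(seed).shuffle(records)  # fixed seed gives the same split every run
    if len(records) > 1:
        n_validation = max(1, int(len(records) * validation_fraction))
    else:
        n_validation = 0  # nothing to hold back from a single record
    return records[n_validation:], records[:n_validation]
```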

By using this step-by-step guide and the provided Python script, you can fine-tune the GPT-4o-mini model for your unique use case effectively. Happy fine-tuning!

January 5, 2025
in Programming
3 min read

Setting Up Venom for WhatsApp Translation

Automating WhatsApp messaging can be a powerful tool for customer service, personal projects, or language translation. Using Venom and Google Translate, this guide shows you how to build a script that translates incoming Spanish messages to English and sends your English replies in Spanish.

Why Use Venom?

Venom is a robust Node.js library that allows you to interact with WhatsApp Web. It’s perfect for creating bots, automating tasks, or building translation systems like the one we’ll create here.

Prerequisites

Before diving in, ensure you have the following installed:

Node.js: Install from Node.js Official Website.
npm: Installed alongside Node.js (yarn also works, but must be installed separately).
Google Translate Library: For text translation.
Venom: For WhatsApp automation.

Install Required Packages

Run the following command to install the required libraries. The crypto module is built into Node.js, so it does not need to be installed:

npm install venom-bot translate-google

Implementation

Here’s how to set up and use Venom to translate WhatsApp messages:

1. Initialize the Project

Create a new file named whatsapp_translator.js and start with the following boilerplate:

const venom = require('venom-bot');
const translate = require('translate-google');
const crypto = require('crypto');

2. Set Up Your WhatsApp Contacts

Define your own WhatsApp ID (for self-messages) and the target contact:

const MY_CONTACT_ID = '12345678900@c.us'; // Your number
const TARGET_CONTACT_ID = '01234567890@c.us'; // Target contact's number

3. Implement the Translation Logic

Here’s the full script for translating messages and avoiding duplicates using a hash set:

// Set of message hashes used to skip messages that were already processed
const processedMessageHashes = new Set();

venom
  .create({
    session: 'my-whatsapp-session',
    multidevice: true,
  })
  .then((client) => start(client))
  .catch((err) => console.error('Error starting Venom:', err));

function start(client) {
  console.log(`Listening for messages between yourself (${MY_CONTACT_ID}) and ${TARGET_CONTACT_ID}.`);

  const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  // Function to generate a hash for deduplication
  function generateHash(messageBody) {
    return crypto.createHash('sha256').update(messageBody).digest('hex');
  }

  // Periodically check for new messages in the self-chat
  setInterval(async () => {
    try {
      const messages = await client.getAllMessagesInChat(MY_CONTACT_ID, true, true);
      for (const message of messages) {
        processMessage(client, message, generateHash);
      }
    } catch (err) {
      console.error('Error retrieving self-chat messages:', err);
    }
  }, 2000); // Check every 2 seconds

  // Handle incoming messages
  client.onMessage((message) => processMessage(client, message, generateHash));
}

async function processMessage(client, message, generateHash) {
  // Skip media or system messages that have no text body to hash or translate
  if (typeof message.body !== 'string' || message.body.length === 0) {
    return;
  }

  const messageHash = generateHash(message.body);

  // Skip if the message has already been processed
  // (note: hashing the body also skips later messages with identical text)
  if (processedMessageHashes.has(messageHash)) {
    return;
  }

  // Mark the message as processed
  processedMessageHashes.add(messageHash);

  try {
    if (message.from === MY_CONTACT_ID && message.to === MY_CONTACT_ID) {
      console.log('Message is from you (self-chat).');

      // Translate English to Spanish and send to the target contact
      const translatedToSpanish = await translate(message.body, { to: 'es' });
      console.log(`Translated (English → Spanish): ${translatedToSpanish}`);

      await client.sendText(TARGET_CONTACT_ID, translatedToSpanish);
      console.log(`Sent translated message to ${TARGET_CONTACT_ID}: ${translatedToSpanish}`);
    } else if (message.from === TARGET_CONTACT_ID && !message.isGroupMsg) {
      console.log('Message is from the target contact.');

      // Translate Spanish to English and send to the self-chat
      const translatedToEnglish = await translate(message.body, { to: 'en' });
      console.log(`Translated (Spanish → English): ${translatedToEnglish}`);

      const response = `*Translation (Spanish → English):*\nOriginal: ${message.body}\nTranslated: ${translatedToEnglish}`;
      // Record the reply's hash first so the self-chat poll does not
      // pick it up and forward it to the target contact
      processedMessageHashes.add(generateHash(response));
      await client.sendText(MY_CONTACT_ID, response);
      console.log(`Posted translation to yourself: ${MY_CONTACT_ID}`);
    }
  } catch (error) {
    console.error('Error processing message:', error);
    // Remove the hash if processing fails so the message can be retried
    processedMessageHashes.delete(messageHash);
  }
}

4. Run the Script

Execute the script using Node.js:

node whatsapp_translator.js

5. What Happens?

Messages you send to yourself (in English) are translated to Spanish and sent to the target contact.
Messages from the target contact (in Spanish) are translated to English and sent to your self-chat.

Debugging Tips

Verify Contact IDs: Ensure MY_CONTACT_ID and TARGET_CONTACT_ID are correctly defined.
Check Logs: Use console.log statements to debug the flow of messages.
Dependency Issues: Reinstall packages with npm install if you encounter errors.

Conclusion

This script automates translation for WhatsApp messages, enabling seamless communication across languages. By leveraging Venom and Google Translate, you can extend this setup to support additional languages, integrate with databases, or even build advanced customer service tools. With this foundation, the possibilities are endless!

December 15, 2024
in Programming
3 min read

Building an Agentic Web Scraping Pipeline for Crypto and Meme Coins

Agentic web scraping combines scraping tools with LLM-based reasoning to turn website content into actionable insights. This guide demonstrates how to build a closed-loop pipeline for analyzing popular crypto and meme coin websites to enhance trading strategies.

Websites to Scrape

The following websites will serve as data inputs for the pipeline:

Movement Market
Facilitates buying and selling meme coins with email and credit card integration.
Raydium
A decentralized exchange for trading tokens and coins.
Jupiter
A platform for seamless token swaps.
Rugcheck
A tool for evaluating meme coins and identifying scams.
Photon Sol
A browser-based solution for trading low-cap coins.
Cielo Finance
Offers a copy-trading platform to follow top-performing wallets.

Step 1: Structuring Data for Public Websites

For effective analysis, raw HTML data from these websites must be structured into human-readable Markdown.

Tool: Firecrawl

Use Firecrawl to scrape and format the websites:

Example: Scraping Movement Market

import requests

FIRECRAWL_API = "https://api.firecrawl.dev/v1/scrape"
API_KEY = "your_firecrawl_api_key"

def scrape_with_firecrawl(url):
    headers = {"Authorization": f"Bearer {API_KEY}"}
    # The v1 scrape endpoint takes a list of output formats
    data = {"url": url, "formats": ["markdown"]}
    response = requests.post(FIRECRAWL_API, json=data, headers=headers, timeout=60)

    if response.status_code == 200:
        # The scraped content is nested under the "data" key
        return response.json().get("data", {}).get("markdown")
    else:
        print(f"Error: {response.status_code} - {response.text}")
        return None

markdown_data = scrape_with_firecrawl("https://movement.market/")
print(markdown_data)

Repeat the process for all listed websites to create structured Markdown files.
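Repeating the scrape by hand gets tedious, so a small loop can do it. This is a minimal sketch: `SITES`, `filename_for`, `scrape_all`, and the `scraped` output directory are illustrative names, and the list holds only the two URLs that appear in this article. The `scrape` argument is any function that takes a URL and returns Markdown or None, such as `scrape_with_firecrawl` above.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Only the URLs shown in this article; add the remaining sites yourself.
SITES = [
    "https://movement.market/",
    "https://photon-sol.tinyastro.io/",
]


def filename_for(url):
    """Turn a URL's host name into a safe Markdown file name."""
    host = urlparse(url).netloc or url
    return re.sub(r"[^a-z0-9]+", "_", host.lower()).strip("_") + ".md"


def scrape_all(urls, scrape, output_dir="scraped"):
    """Scrape each URL with `scrape` and save every non-empty result as Markdown."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        markdown = scrape(url)
        if not markdown:
            print(f"Skipping {url}: nothing returned")
            continue
        path = out / filename_for(url)
        path.write_text(markdown, encoding="utf-8")
        saved.append(path)
    return saved
```

With the function from the previous example in scope, `scrape_all(SITES, scrape_with_firecrawl)` writes one file per site that returned content.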

Step 2: Analyze Public Data with Reasoning Agents

Once the data is structured, LLMs can be used to analyze trends, extract features, and provide actionable insights.

Example: Analyzing Data with OpenAI API

from openai import OpenAI

client = OpenAI(api_key="your_openai_api_key")

def analyze_markdown(markdown_data):
    # The Chat Completions API replaces the retired Completions API and text-davinci-003
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Analyze this Markdown data to identify trading opportunities and community sentiment:\n\n{markdown_data}",
        }],
        max_tokens=1000
    )
    return response.choices[0].message.content.strip()

markdown_example = "# Example Markdown\nThis is an example of markdown content for analysis."
analysis = analyze_markdown(markdown_example)
print(analysis)

Step 3: Scraping Private Data with Web Automation

For websites requiring interaction (e.g., logins or dynamic content), use Python's Playwright library with AgentQL for advanced navigation and data extraction.

Example: Scraping Photon Sol with Playwright and AgentQL

Install Playwright and its browser binaries (AgentQL is a separate package and is not used in the script below):

pip install playwright
playwright install

Write the Python Script:

from playwright.sync_api import sync_playwright

def scrape_photon_sol():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()

        # Navigate to Photon Sol
        page.goto("https://photon-sol.tinyastro.io/")

        # Simulate interactions if needed
        page.wait_for_timeout(3000)  # Fixed 3-second pause so client-side scripts can render
        content = page.content()

        print(content)  # Print or save the page content
        browser.close()

scrape_photon_sol()

This approach ensures data can be extracted even from dynamic websites.
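The HTML returned by `page.content()` still contains markup, scripts, and styles, so it usually needs reducing to readable text before analysis. Below is a minimal sketch using only the standard library. `VisibleTextExtractor` and `html_to_text` are illustrative names, and the approach is a simple stand-in rather than a full HTML-to-Markdown conversion.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style elements."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than zero while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML document, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The resulting text can then be passed to the analysis step in place of Markdown.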

Step 4: Automating the Pipeline

Use Python-based automation tools like Apache Airflow to schedule and run the scraping and analysis pipeline.

Example: Airflow Configuration for the Pipeline

from airflow import DAG
from airflow.operators.python import PythonOperator  # Airflow 2.x import path
from datetime import datetime

def scrape():
    # Add scraping logic for all websites here
    print("Scraping data...")

def analyze():
    # Add analysis logic here
    print("Analyzing data...")

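# catchup=False stops Airflow from backfilling a run for every day since start_date.
# The schedule argument needs Airflow 2.4+; older versions use schedule_interval instead.
with DAG('crypto_pipeline', start_date=datetime(2024, 11, 25), schedule='@daily', catchup=False) as dag: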
    scrape_task = PythonOperator(task_id='scrape', python_callable=scrape)
    analyze_task = PythonOperator(task_id='analyze', python_callable=analyze)

    scrape_task >> analyze_task

Insights from Websites

Here's what you can focus on while analyzing the scraped data:

Movement Market: Review ease of use, transaction speed, and user feedback.
Raydium: Analyze liquidity and trading fees for tokens.
Jupiter: Evaluate swap rates and platform efficiency.
Rugcheck: Identify red flags in meme coin projects to avoid scams.
Photon Sol: Assess platform usability for low-cap token trading.
Cielo Finance: Analyze wallet strategies and portfolio performance.
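These focus areas can be fed straight into the analysis prompt. The sketch below is one possible arrangement: `SITE_FOCUS`, `build_prompt`, the fallback wording, and the `max_chars` cutoff are assumptions for illustration, with the focus text taken from the list above.

```python
# Focus areas copied from the list above, keyed by site name
SITE_FOCUS = {
    "Movement Market": "ease of use, transaction speed, and user feedback",
    "Raydium": "liquidity and trading fees for tokens",
    "Jupiter": "swap rates and platform efficiency",
    "Rugcheck": "red flags in meme coin projects that may indicate scams",
    "Photon Sol": "platform usability for low-cap token trading",
    "Cielo Finance": "wallet strategies and portfolio performance",
}


def build_prompt(site, markdown, max_chars=12000):
    """Build a site-specific analysis prompt from scraped Markdown."""
    # Unknown sites fall back to the general instruction used in Step 2
    focus = SITE_FOCUS.get(site, "trading opportunities and community sentiment")
    excerpt = markdown[:max_chars]  # crude length cap; an assumed limit, tune as needed
    return (
        f"You are reviewing scraped Markdown from {site}. "
        f"Focus on {focus}.\n\n{excerpt}"
    )
```

The returned string can be used as the user message in the Step 2 analysis call.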

Step 5: Closing the Loop

To maintain a closed-loop pipeline, configure the workflow to automatically re-scrape websites at regular intervals and update analyses with new data. This ensures decisions are based on the latest information.
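Re-scraping on a schedule does not mean every page needs re-analysis each time. One option, sketched here under assumptions, is to hash each page's content and analyze only the sites whose hash changed since the last run. `content_hash`, `changed_sites`, and the `scrape_state.json` file name are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def content_hash(text):
    """Return a stable fingerprint of the scraped text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_sites(scraped, state_path="scrape_state.json"):
    """Return the names whose content differs from the previous run.

    `scraped` maps a site name to its scraped text. The hashes from this run
    are saved to `state_path` for the next comparison.
    """
    path = Path(state_path)
    previous = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    current = {name: content_hash(text) for name, text in scraped.items()}
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    # Keep hashes for sites that were not scraped this time
    path.write_text(json.dumps({**previous, **current}, indent=2), encoding="utf-8")
    return changed
```

Inside the scheduled analysis task, only the names returned by `changed_sites` would need to be sent to the LLM.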

Conclusion

By integrating structured scraping, LLM-based analysis, and scheduled automation, this agentic pipeline keeps your view of the crypto and meme coin ecosystem up to date. Use the steps outlined above to make better-informed decisions in the volatile world of meme coins. 🚀