A C# library combining TypeChat schema-based prompt engineering with Playwright for natural language web automation. Simplify browser interactions using human-like commands.

montraydavis/MDLabs.SchematicBrowserNavigation

MDLabs.SchematicBrowserNavigation Documentation

Table of Contents

  1. Introduction
  2. Installation
  3. Getting Started
  4. Core Concepts
  5. API Reference
  6. Schema Definition
  7. Natural Language Commands and Prompt Examples
  8. Configuration
  9. Troubleshooting
  10. Best Practices

1. Introduction

MDLabs.SchematicBrowserNavigation is a C# library that combines TypeChat schema-based prompt engineering with Playwright to enable natural language-driven web automation. It allows automation engineers to control web browsers using AI-powered natural language commands.

2. Installation

Prerequisites

  • .NET 6.0 or later
  • Node.js 14.0 or later (for Playwright)

Steps

  1. Install the NuGet package:

    dotnet add package MDLabs.SchematicBrowserNavigation
    
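  2. Build the project, then install the Playwright browsers. The playwright.ps1 script is generated in the build output, so adjust the path to match your target framework: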

    pwsh bin/Debug/net6.0/playwright.ps1 install
    
  3. Configure your OpenAI or Azure OpenAI API key by adding it to the OpenAI section of appsettings.json:

    {
      "OpenAI": {
        "ApiKey": "your-api-key-here"
      }
    }

3. Getting Started

Basic usage example:

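using System.Threading;
using MDLabs.SchematicBrowserNavigation;

// Token source that lets the caller cancel a long-running command.
using var cancellationTokenSource = new CancellationTokenSource();
var app = new BrowserApp();

// Launch the browser before sending any commands.
await app.InitializePlaywright();

// Send a natural language command to the browser.
await app.EvalInputAsync("Navigate to Home page.", cancellationTokenSource.Token).ConfigureAwait(false);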

4. Core Concepts

  • BrowserApp: The main class for interacting with the library
  • UserInterfaceIntent: Defines the structure of automation commands
  • Natural Language Commands: Human-readable instructions for automation tasks

5. API Reference

BrowserApp

  • InitializePlaywright(): Initializes the Playwright browser
  • ProcessInputAsync(string input, CancellationToken cancelToken): Processes a natural language command

UserInterfaceIntent

  • Intent: The type of action to perform (e.g., NavigateToUrl, ClickButton)
  • TargetElement: The UI element to interact with
  • TargetPage: The page to navigate to
  • Inputs: Additional input data for the command

6. Schema Definition

The schema is defined using C# enums and classes:

public enum UserInterfaceIntentType
{
    NavigateToUrl,
    NavigateToPage,
    ClickButton,
    FillInput,
    // ... other intent types
}

public class UserInterfaceIntent
{
    public UserInterfaceIntentType Intent { get; set; }
    public UserInterfaceElement? TargetElement { get; set; }
    public UserInterfacePage? TargetPage { get; set; }
    public string[]? Inputs { get; set; }
}
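
To illustrate how a JSON document shaped by this schema maps onto the classes, here is a minimal sketch using System.Text.Json. It is not the library's own parsing code. UserInterfaceElement and UserInterfacePage are not shown in this document, so the two enums below are hypothetical stand-ins, and IntentJson.Parse is a helper invented for this example.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-ins: this document does not show the real
// UserInterfaceElement and UserInterfacePage types.
public enum UserInterfaceElement { LoginButton, EmailInput }
public enum UserInterfacePage { Home, Login }

// Abbreviated copy of the schema above so the sketch compiles on its own.
public enum UserInterfaceIntentType { NavigateToUrl, NavigateToPage, ClickButton, FillInput }

public class UserInterfaceIntent
{
    public UserInterfaceIntentType Intent { get; set; }
    public UserInterfaceElement? TargetElement { get; set; }
    public UserInterfacePage? TargetPage { get; set; }
    public string[]? Inputs { get; set; }
}

public static class IntentJson
{
    // Hypothetical helper, not part of the library's API.
    public static UserInterfaceIntent Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        // Read enum members by name ("ClickButton") rather than by number.
        options.Converters.Add(new JsonStringEnumConverter());
        return JsonSerializer.Deserialize<UserInterfaceIntent>(json, options)
            ?? throw new JsonException("The JSON document was null.");
    }
}
```

Under these assumptions, parsing {"Intent":"FillInput","TargetElement":"EmailInput","Inputs":["user@example.com"]} yields an intent whose TargetPage is null, because properties missing from the JSON keep their default values.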

7. Natural Language Commands and Prompt Examples

Each supported intent type is listed below with an example natural language command:

  1. NavigateToUrl:

    Navigate to https://www.example.com
  2. NavigateToPage:

    Go to the Home page
  3. ClickButton:

    Click the Login button
  4. FillInput:

    Enter "user@example.com" in the Email input
  5. SelectDropdownOption:

    Select "Manage Profile" from the "User Options" menu
  6. HoverElement:

    Hover over the Products menu
  7. FocusElement:

    Focus on the Search input
  8. BlurElement:

    Blur the current element
  9. AssertElementAttached:

    Verify that the Login button is attached to the page
  10. AssertElementDetached:

    Check if the Loading spinner is detached
  11. AssertElementVisible:

    Ensure the Error message is visible
  12. AssertElementHidden:

    Confirm that the Password field is hidden
  13. WaitForNetwork:

    Wait for the network to be idle
  14. WaitForSelector:

    Wait for "#someId" to appear

8. Configuration

The library uses configuration files for OpenAI and application-specific settings:

OpenAIConfig config = Config.LoadOpenAI();
AppConfig appConfig = Config.LoadAppConfig();

App Configuration

The AppConfig class defines the structure for application-specific settings:

public class AppConfig
{
    public string BaseUrl { get; set; }
    public Dictionary<string, string> Pages { get; set; }
    public Dictionary<string, string> Elements { get; set; }
}

These settings are loaded from the appsettings.json file:

{
  "App": {
    "BaseUrl": "https://sauce-demo.myshopify.com",
    "Pages": {
      "Home": "{base}/collections/all#sauce-show-wish-list",
      "Login": "{base}/account/login",
      "Catalog": "{base}/collections/all",
      "Blog": "{base}/blogs/news",
      "AboutUs": "{base}/pages/about-us",
      "SignUp": "{base}/account/register"
    },
    "Elements": {
      "SignupMenu": "[href='/account/register']",
      "LoginMenu": "[href='/account/login']",
      "LogoMenu": "#logo > a",
      "HomeMenu": "[href='/']",
      "CatalogMenu": "[href='/collections/all']",
      "BlogMenu": "[href='/blogs/news']",
      "AboutUsMenu": "[href='/pages/about-us']",
      "EmailInput": "#customer_email",
      "PasswordInput": "#customer_password",
      "LoginButton": ".action_bottom > button[value='Sign In']"
    }
  }
}
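
The library loads these settings through Config.LoadAppConfig(), whose implementation is not shown here. As a rough sketch of what binding the "App" section to AppConfig could look like, the following uses System.Text.Json. AppConfigJson.FromJson is a hypothetical helper written for this example, and the property initializers are added only to avoid null values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class AppConfig
{
    public string BaseUrl { get; set; } = "";
    public Dictionary<string, string> Pages { get; set; } = new();
    public Dictionary<string, string> Elements { get; set; } = new();
}

public static class AppConfigJson
{
    // Hypothetical helper; the library's own loader is Config.LoadAppConfig().
    public static AppConfig FromJson(string json)
    {
        using var document = JsonDocument.Parse(json);

        // The settings live under the top-level "App" property.
        if (!document.RootElement.TryGetProperty("App", out var appSection))
            throw new InvalidOperationException("The JSON has no \"App\" section.");

        return appSection.Deserialize<AppConfig>()
            ?? throw new InvalidOperationException("The \"App\" section is null.");
    }
}
```

A caller could pass in the file contents, for example AppConfigJson.FromJson(File.ReadAllText("appsettings.json")).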

Pages Configuration

The Pages dictionary maps page names to their URLs. The {base} placeholder is replaced with the BaseUrl value when constructing the full URL.
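
The placeholder substitution can be sketched as follows. This is an illustration rather than the library's implementation: PageUrls.Resolve is a hypothetical helper, and trimming a trailing slash from the base URL is an assumption made here to avoid doubled slashes.

```csharp
using System;
using System.Collections.Generic;

public static class PageUrls
{
    // Hypothetical helper, not part of the library's API.
    public static string Resolve(
        string baseUrl,
        IReadOnlyDictionary<string, string> pages,
        string pageName)
    {
        if (!pages.TryGetValue(pageName, out var template))
            throw new KeyNotFoundException($"No page named '{pageName}' is configured.");

        // Trim a trailing slash so "{base}/account/login" never yields "//".
        return template.Replace("{base}", baseUrl.TrimEnd('/'));
    }
}
```

With the configuration above, resolving "Login" against the base URL https://sauce-demo.myshopify.com gives https://sauce-demo.myshopify.com/account/login.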

Elements Configuration

The Elements dictionary maps element names to their CSS selectors. These selectors are used to locate and interact with specific elements on the web pages.

Because page URLs and element selectors live in configuration, they can be updated without modifying the core code.
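
A selector lookup with an early, descriptive failure might look like the sketch below. ElementSelectors.Resolve is a hypothetical helper, and how the library itself looks up selectors is not shown in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementSelectors
{
    // Hypothetical helper, not part of the library's API.
    public static string Resolve(
        IReadOnlyDictionary<string, string> elements,
        string elementName)
    {
        if (elements.TryGetValue(elementName, out var selector)
            && !string.IsNullOrWhiteSpace(selector))
        {
            return selector;
        }

        // List the configured names so a typo in the element name is easy to spot.
        var known = string.Join(", ", elements.Keys.OrderBy(k => k, StringComparer.Ordinal));
        throw new KeyNotFoundException(
            $"No selector configured for '{elementName}'. Known elements: {known}.");
    }
}
```

The returned selector is an ordinary CSS selector string, so it can be handed to Playwright, for example page.Locator(selector). Failing before the browser is touched keeps a misspelled element name from surfacing later as a Playwright timeout.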

9. Troubleshooting

Common issues and solutions:

  • Playwright initialization fails: Ensure Node.js is installed and Playwright browsers are properly set up
  • Command not recognized: Check that the command maps to one of the defined UserInterfaceIntentType values, and rephrase it if it does not

10. Best Practices

  • Keep natural language commands clear and concise
  • Use try-catch blocks to handle potential errors in command execution
  • Validate inputs and target elements before performing actions
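
The error-handling advice above can be sketched as a small runner that executes commands in order, applies a timeout to each, and collects failures instead of stopping at the first one. CommandRunner is hypothetical and not part of the library. The execute delegate stands in for a call such as app.EvalInputAsync(command, token), assuming that method returns a Task.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CommandRunner
{
    // Runs each command in order and returns the ones that failed, with the reason.
    public static async Task<List<(string Command, string Error)>> RunAllAsync(
        IEnumerable<string> commands,
        Func<string, CancellationToken, Task> execute,
        TimeSpan timeoutPerCommand)
    {
        var failures = new List<(string Command, string Error)>();

        foreach (var command in commands)
        {
            // Validate before doing any work.
            if (string.IsNullOrWhiteSpace(command))
            {
                failures.Add((command, "Command is empty."));
                continue;
            }

            // A fresh token source per command, cancelled after the timeout.
            using var cts = new CancellationTokenSource(timeoutPerCommand);
            try
            {
                await execute(command, cts.Token).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                failures.Add((command, "Timed out or was cancelled."));
            }
            catch (Exception ex)
            {
                failures.Add((command, ex.Message));
            }
        }

        return failures;
    }
}
```

A call might look like await CommandRunner.RunAllAsync(commands, (cmd, token) => app.EvalInputAsync(cmd, token), TimeSpan.FromSeconds(30)). Whether later commands should still run after an earlier failure depends on the scenario, and a stricter runner could return at the first error.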
