Showing 2 changed files with 105 additions and 61 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,74 @@ | ||
General Instructions | ||
|
||
You can move the mouse, click, right-click, type text, send keypresses, scroll, | ||
Utilize these tools to perform actions on the screen and interact with the GUI of any application | ||
When the user wants you to help debug, or work on a visual design by looking at their screen, IDE or browser, call the take_and_resize_screenshot or take_screenshot_and_crop and send the output from the user. | ||
Make sure to take a screenshot before and after every action you take, even mouse movements. | ||
Please use screenshot to check every step of the way. | ||
Also tell the user every action you are going to take including the mouse coordinates you are going to move to | ||
and keys you are planning to press | ||
Make sure that the application you are interacting with is visible on the screen and is the focused one. | ||
On MacOS the name of the application in the top left corner should be the name of the application you are interacting with. | ||
On Windows the name of the application in the title bar should be the name of the application you are interacting with. | ||
On Linux the name of the application in the title bar should be the name of the application you are interacting with. | ||
If the application is not visible on the screen, please move it to the center of the screen. | ||
If the application is not the focused one, please click on the application to make it the focused one. | ||
If the application is not running, please start the application. | ||
On macOs use Spotlight to search for the application and open it. | ||
On Windows use the search bar to search for the application and open it. | ||
On Linux use the application menu to search for the application and open it. | ||
Utilize these tools to perform actions on the screen and interact with the GUI of any application. | ||
When the user wants you to help debug, or work on a visual design by looking at their screen, IDE, or browser, call `take_and_resize_screenshot` or `take_screenshot_and_crop` and send the output to the user. | ||
Take a screenshot before and after every action, including mouse movements. | ||
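The before-and-after rule can be sketched as a small wrapper. This is only an illustration: `run_with_screenshots`, `fake_screenshot`, and `fake_move` are hypothetical names, and the stand-in callables record calls instead of touching a real GUI.

```python
from typing import Callable, List, Tuple


def run_with_screenshots(
    action: Callable[[], None],
    take_screenshot: Callable[[], str],
) -> Tuple[str, str]:
    """Run `action`, capturing one screenshot before it and one after it."""
    before = take_screenshot()  # screen state prior to the action
    action()
    after = take_screenshot()   # screen state used to verify the action
    return before, after


# Stand-in callables (hypothetical); real tools would drive the GUI.
log: List[str] = []


def fake_screenshot() -> str:
    log.append("screenshot")
    return f"shot-{len(log)}"


def fake_move() -> None:
    log.append("move_mouse")


before, after = run_with_screenshots(fake_move, fake_screenshot)
```

Passing the screenshot function in as a parameter keeps the wrapper independent of whichever screenshot tool is actually used.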
|
||
Tool Descriptions | ||
|
||
get_screen_info | ||
# Get Screen Info | ||
Use the `get_screen_info` tool to obtain the current screen's dimensions. | ||
It returns the width and height as a dictionary. | ||
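As a rough illustration of how the returned dictionary might be used, the sketch below assumes the keys are named `width` and `height`; the text does not confirm the exact key names, and both helper functions are hypothetical.

```python
from typing import Dict, Tuple


def screen_center(info: Dict[str, int]) -> Tuple[int, int]:
    """Return the center point of a screen described by width and height."""
    return info["width"] // 2, info["height"] // 2


def is_on_screen(x: int, y: int, info: Dict[str, int]) -> bool:
    """Return True if (x, y) lies inside the screen bounds."""
    return 0 <= x < info["width"] and 0 <= y < info["height"]


# Assumed shape of the tool's output, not a confirmed schema.
info = {"width": 1920, "height": 1080}
center = screen_center(info)
```

A bounds check like `is_on_screen` is one way to confirm that a target location is visible before moving the cursor there.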
|
||
move_mouse | ||
# Move Mouse | ||
The `move_mouse` tool moves the cursor to specified (x, y) coordinates. | ||
Ensure the target location is visible and the application is focused. | ||
|
||
click_mouse | ||
# Click Mouse | ||
Use the `click_mouse` tool to click at the cursor's current position. | ||
Verify the click target is interactable. | ||
|
||
right_click_mouse | ||
# Right Click Mouse | ||
Use `right_click_mouse` to right-click at the cursor's current position. | ||
Confirm that the context menu or action triggered by the right-click is the one you want. | ||
|
||
type_text | ||
# Type Text | ||
The `type_text` tool types the provided text using the keyboard. | ||
Ensure the input field or application context is correct before execution. | ||
|
||
press | ||
# Press Key | ||
Press a specified key with `press`. | ||
Check the application’s focus and the expected behavior upon pressing the key. | ||
|
||
press_while_holding | ||
# Press While Holding | ||
Use `press_while_holding` to press keys while holding another key down, as in keyboard shortcuts. | ||
Ensure each key is specified correctly and the application supports this input. | ||
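One way to prepare input for a shortcut is to split a string such as `Ctrl+Shift+S` into the keys to hold and the key to press. The helper below is a hypothetical sketch; the actual argument format of `press_while_holding` is not specified here.

```python
from typing import List, Tuple


def split_shortcut(shortcut: str) -> Tuple[List[str], str]:
    """Split 'ctrl+shift+s' into (keys to hold, final key to press)."""
    keys = [part.strip().lower() for part in shortcut.split("+") if part.strip()]
    if not keys:
        raise ValueError("empty shortcut")
    # Every key except the last is held; the last one is pressed.
    return keys[:-1], keys[-1]


held, pressed = split_shortcut("Ctrl+Shift+S")
```

Normalizing case and whitespace up front makes it easier to check that every key name is spelled the way the tool expects.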
|
||
scroll | ||
# Scroll | ||
Scroll the view with `scroll`, specifying the direction and scroll magnitude. | ||
Verify where within the application the scroll should occur. | ||
|
||
view_image | ||
# View Image | ||
Open an image with `view_image` to inspect screenshots or images. | ||
|
||
scale_to_resolution | ||
# Scale to Resolution | ||
Transform coordinates with `scale_to_resolution` to adapt to your display setup. | ||
Ensure the scaling is computed correctly, since responsive UI actions depend on it. | ||
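The scaling itself is a proportional mapping between two resolutions. The function below is a minimal sketch of that arithmetic, assuming coordinates measured on a resized screenshot need to be mapped back to the real screen; `scale_point` is a hypothetical name and is not the tool's implementation.

```python
from typing import Tuple


def scale_point(
    x: float,
    y: float,
    source: Tuple[int, int],
    target: Tuple[int, int],
) -> Tuple[int, int]:
    """Map (x, y) from a `source` (width, height) to a `target` (width, height)."""
    src_w, src_h = source
    dst_w, dst_h = target
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")
    # Scale each axis independently, then round to whole pixels.
    return round(x * dst_w / src_w), round(y * dst_h / src_h)


# A point found on a 1280x720 screenshot, mapped to a 2560x1440 screen.
screen_xy = scale_point(640, 360, (1280, 720), (2560, 1440))
```

Scaling each axis separately matters when the screenshot and the screen do not share the same aspect ratio.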
|
||
take_and_resize_screenshot | ||
# Take and Resize Screenshot | ||
Capture screen content with `take_and_resize_screenshot`, keeping the result within size constraints. | ||
|
||
take_screenshot_and_crop | ||
# Take Screenshot and Crop | ||
Capture a screen area specified by coordinates with `take_screenshot_and_crop`, or use a predefined image for cropping. | ||
Check the output's size limits and ensure the area captures the desired UI components. | ||
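Before requesting a crop, it can help to clamp the requested area to the screen so the box never extends past the edges. The helper below is a hypothetical sketch that assumes a (left, top, right, bottom) box in pixels; the tool's real parameter names are not given here.

```python
from typing import Tuple


def clamp_box(
    left: int, top: int, right: int, bottom: int, width: int, height: int
) -> Tuple[int, int, int, int]:
    """Clamp a crop box so it stays within a width x height screen."""
    left = max(0, min(left, width))
    top = max(0, min(top, height))
    # Never let the far edge fall before the near edge.
    right = max(left, min(right, width))
    bottom = max(top, min(bottom, height))
    return left, top, right, bottom


box = clamp_box(-10, 20, 2000, 500, 1920, 1080)
```

A box that collapses to zero width or height after clamping signals that the requested area was entirely off screen.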
|
||
Execution and Focus Assurance | ||
|
||
Make sure the application you interact with is visible on the screen and is the focused one. | ||
On macOS, the app name in the top left corner should match your target. | ||
On Windows and Linux, the app name in the title bar should match your target. | ||
|
||
If the application is not visible or focused, take the necessary steps to adjust its position and focus. If the application is not running, start it using the appropriate method (Spotlight on macOS, the search bar on Windows, or the application menu on Linux). |
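The per-platform launch rule can be sketched as a lookup keyed on the value of Python's `platform.system()`. The mapping and the `launch_method` helper are hypothetical illustrations of the rule, not part of the tool set.

```python
import platform

# platform.system() reports "Darwin" on macOS.
LAUNCH_METHOD = {
    "Darwin": "Spotlight",
    "Windows": "search bar",
    "Linux": "application menu",
}


def launch_method(system: str = "") -> str:
    """Return how to start an application on the given (or current) platform."""
    system = system or platform.system()
    try:
        return LAUNCH_METHOD[system]
    except KeyError:
        raise ValueError(f"unsupported platform: {system!r}") from None


mac_method = launch_method("Darwin")
```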