Skip to content

Commit

Permalink
Update results
Browse files Browse the repository at this point in the history
  • Loading branch information
capjamesg committed Dec 30, 2024
1 parent 7f2664d commit d1b988b
Show file tree
Hide file tree
Showing 2 changed files with 179 additions and 71 deletions.
144 changes: 73 additions & 71 deletions index.html
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ <h1>How's GPT-4o Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
<p>Tests are run every day at 1am PT. Last updated December 29, 2024.</p>
<p>Tests are run every day at 1am PT. Last updated December 30, 2024.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
Expand All @@ -58,12 +58,12 @@ <h1>How's GPT-4o Doing?</h1>
<div class="feature_header" style="min-height: auto">
<div class="feature_header_text" style="gap: var(--spacing-sizing-4)">
<h2>Response Time</h2>
<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.78 seconds</b> per request.</p>
<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.77 seconds</b> per request.</p>
<p class="subtitle">This number only accounts for requests made by this application.</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
<p>3.78 s</p>
<p>3.77 s</p>
</div>
</div>
</div>
Expand Down Expand Up @@ -122,7 +122,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>8</pre>
<pre>9</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
Expand Down Expand Up @@ -181,61 +181,7 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
<h2>Handwriting OCR</h2>
<p>Can GPT-4V read handwriting?</p>
</div>
<div class="chart">
<div class="chart_box chart_box_red">
<p>Fail</p>
</div>
</div>
</div>
<div class="result_summary">
<div class="summary_row">
<b class="summary_title">Last 7-Day Performance</b>
<div class="summary_squares">

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_red"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>86.0%</b> of the time.</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.01</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
<div class="explainer">
<h3><span class="explainer_icon far fa-microscope"></span>Method</h3>
<pre class="test_method">We send a image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<pre class="prompt">
Read the text in the image. Return only the text, with punctuation.
</pre>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/ocr.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>The words of songs on the album have been echoing in my head all week. “Fades into the grey of my day old tea.”</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
</div>


<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
Expand Down Expand Up @@ -284,7 +230,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>{'x': 0.4, 'y': 0.35, 'width': 0.2, 'height': 0.3}</pre>
<pre>{'x': 0.5, 'y': 0.3, 'width': 0.25, 'height': 0.15}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
Expand Down Expand Up @@ -324,7 +270,7 @@ <h2>Graph Understanding</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>0%</b> of the time.</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.012</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.011</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
Expand Down Expand Up @@ -413,9 +359,9 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
"R": 80,
"G": 0,
"B": 128
"R": 85,
"G": 20,
"B": 160
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
Expand Down Expand Up @@ -457,7 +403,7 @@ <h2>Annotation Quality Assurance</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>0%</b> of the time.</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.016</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.017</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
Expand All @@ -471,9 +417,11 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/annotationqa.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>To determine the number of missing annotations, I checked the vehicles in the image and compared them to identified bounding boxes. One car on the far right (the white car) is unlabeled in this image.
<pre>To determine if there are missing annotations in this dataset sample, I would need to verify that all visible cars in the image have red bounding boxes. Upon inspection:

### JSON Output:
- There are visible cars not labeled with red bounding boxes, such as the white car on the right side of the road.

Based on this, it seems there are missing annotations. Here's the JSON:

```json
{
Expand Down Expand Up @@ -533,9 +481,9 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/measurement.jpg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>The dimensions of the sticker are approximately 3 inches by 3 inches based on the ruler in the image.
<pre>Based on the image and the ruler for scale, the sticker appears to have a length and width of approximately **3.0 inches**.

Here is a JSON representation:
Here is the corresponding JSON structure:

```json
{
Expand Down Expand Up @@ -610,7 +558,61 @@ <h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
</div>
</div>
</div>


<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
<h2>Handwriting OCR</h2>
<p>Can GPT-4V read handwriting?</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
<p>Pass</p>
</div>
</div>
</div>
<div class="result_summary">
<div class="summary_row">
<b class="summary_title">Last 7-Day Performance</b>
<div class="summary_squares">

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_red"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

<div class="summary_square summary_square_green"></div>

</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>86.0%</b> of the time.</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.01</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
<div class="explainer">
<h3><span class="explainer_icon far fa-microscope"></span>Method</h3>
<pre class="test_method">We send a image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.</pre>
<h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<pre class="prompt">
Read the text in the image. Return only the text, with punctuation.
</pre>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/ocr.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day old tea."</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
</div>

<div class="feature_card">
<div class="feature_header">
<div class="feature_header_text">
Expand Down Expand Up @@ -807,7 +809,7 @@ <h2>Easy Captcha with Persuasion Attack</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>86.0%</b> of the time.</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.007</p>
<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.005</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
Expand Down
Loading

0 comments on commit d1b988b

Please sign in to comment.