Use Thread and various performance improvements #5
Conversation
> **Caution**: Review failed. The pull request is closed.

**Walkthrough**: The pull request introduces multiple changes across several files, primarily focused on performance improvements.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (18)
**src/handlers.rs (2)**

Lines 14-15: **Consider enhancing the health check log message.** While the debug logging is a good addition, consider including the version number in the log message for better observability.

```diff
-    debug!("Health check endpoint called");
+    debug!("Health check endpoint called, version: {}", env!("CARGO_PKG_VERSION"));
```
Lines 36-54: **Consider span field optimization and async performance.** The tracing implementation is good, but consider these optimizations:

- Cache the string conversions for method and path before creating the span
- Consider using `instrument(span)` at the function level instead of wrapping the entire implementation

This could reduce allocations and improve performance for high-throughput scenarios.

```diff
 pub async fn proxy_request(
     State(config): State<Arc<AppConfig>>,
     headers: HeaderMap,
     ConnectInfo(addr): ConnectInfo<SocketAddr>,
     request: Request<Body>,
-) -> impl IntoResponse {
+) -> impl IntoResponse {
+    let method = request.method().to_string();
+    let path = request.uri().path().to_string();
+    let span = tracing::info_span!(
+        "proxy_request",
+        provider = provider,
+        method = %method,
+        path = %path,
+        client = %addr
+    );
+    proxy_request_inner(config, headers, addr, request).instrument(span).await
+}
+
+async fn proxy_request_inner(
+    State(config): State<Arc<AppConfig>>,
+    headers: HeaderMap,
+    addr: SocketAddr,
+    request: Request<Body>,
+) -> impl IntoResponse {
+    // ... rest of the implementation
```

**src/main.rs (3)**
Lines 21-27: **Consider making the logging format configurable.** While the compact format is good for production, it might hide useful debugging information during development. Consider making the logging format configurable through the `AppConfig`.

```diff
-        .with(tracing_subscriber::fmt::layer().compact())
+        .with(
+            tracing_subscriber::fmt::layer()
+                .with_target(config.debug_mode)
+                .with_file(config.debug_mode)
+                .with_line_number(config.debug_mode)
+                .compact()
+        )
```
Lines 35-38: **Consider making thread stack size configurable.** While optimizing the thread stack size is good, the hard-coded value of 2 MB might not be optimal for all workloads. Consider making it configurable through `AppConfig` with 2 MB as the default.

```diff
-    std::env::set_var("TOKIO_THREAD_STACK_SIZE", (2 * 1024 * 1024).to_string());
+    std::env::set_var("TOKIO_THREAD_STACK_SIZE", config.thread_stack_size.to_string());
```
Lines 48-54: **Consider adding rate limiting and request size limits.** While the router setup is good, consider adding the following middleware for better protection:

- Rate limiting to prevent DoS
- Request size limits to prevent memory exhaustion
- Timeout middleware for long-running requests

```diff
 let app = Router::new()
     .route("/health", get(handlers::health_check))
     .route("/v1/*path", any(handlers::proxy_request))
     .with_state(config.clone())
+    .layer(tower_http::limit::RequestBodyLimitLayer::new(config.max_request_size))
+    .layer(tower_http::timeout::TimeoutLayer::new(Duration::from_secs(30)))
     .layer(cors)
```

**src/proxy/client.rs (3)**
Lines 3-5: **Organize import statements for better readability.** Consider grouping related imports together to enhance readability. You can combine the `tracing` imports into a single line. Apply this diff to reorganize the imports:

```diff
 use once_cell::sync::Lazy;
 use std::time::Duration;
+use tracing::{debug, info};
-use crate::config::AppConfig;
-use tracing::info;
-use tracing::debug;
+use crate::config::AppConfig;
```
Lines 8-14: **Adjust logging levels to prevent sensitive data exposure.** While logging is essential for monitoring, be cautious about the verbosity and level of detail, especially in production environments. Configuration details might contain sensitive information. Consider changing the `info`-level log to `debug` and ensuring that no sensitive data is logged. Apply this diff to adjust the logging levels:

```diff
-    info!("Creating HTTP client with optimized settings");
-    debug!(
+    debug!("Creating HTTP client with optimized settings");
+    info!(
         "Client config: max_connections={}, keepalive={}s, nodelay={}",
         config.max_connections, config.tcp_keepalive_interval, config.tcp_nodelay
     );
```
Lines 17-29: **Make timeout values configurable.** The timeout values and durations are currently hardcoded. For greater flexibility and adaptability to different environments, consider making these values configurable through `AppConfig`.

Suggestion: add new fields to `AppConfig` for the following configurations:

- `pool_idle_timeout`
- `http2_keep_alive_timeout`
- `general_timeout`
- `connect_timeout`

Then, update the client builder as follows:

```rust
.pool_idle_timeout(Duration::from_secs(config.pool_idle_timeout))
.http2_keep_alive_timeout(Duration::from_secs(config.http2_keep_alive_timeout))
.timeout(Duration::from_secs(config.general_timeout))
.connect_timeout(Duration::from_secs(config.connect_timeout))
```

Ensure that `AppConfig` and its initialization are updated to include these new fields.

**src/config.rs (5)**
Lines 9-13: **Consider adding documentation comments for new config fields.** For better code readability and maintainability, consider adding documentation comments to the new fields in `AppConfig` to explain their purpose and units (e.g., whether `tcp_keepalive_interval` is in seconds or milliseconds).
Lines 18-19: **Load environment variables earlier in the application's startup sequence.** Calling `dotenv::dotenv().ok();` within `AppConfig::new()` may be too late if environment variables are needed elsewhere during application startup. Consider loading the environment variables at the very beginning of the application, such as in the `main` function.
Lines 25-29: **Clarify the logic for calculating default worker threads.** The calculation for `default_workers` may benefit from clarification or adjustment:

- For CPU cores ≤ 4, `default_workers = cpu_count * 2`
- For CPU cores > 4, `default_workers = cpu_count + 4`

Consider whether this scaling is appropriate for your application's performance requirements. You might also want to explain the rationale behind these specific calculations, possibly through code comments, to make it clear for future maintainers.
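The scaling rule above can be sketched as a pure function. This is a hypothetical `default_workers` helper for illustration (not code from the PR), which makes the break-even behavior around 4 cores easy to unit-test:

```rust
/// Hypothetical sketch of the default worker-thread rule described above:
/// machines with few cores get cpu_count * 2 (I/O-bound request handling
/// tolerates oversubscription), larger machines get cpu_count + 4 to cap
/// per-thread overhead.
fn default_workers(cpu_count: usize) -> usize {
    if cpu_count <= 4 {
        cpu_count * 2
    } else {
        cpu_count + 4
    }
}

fn main() {
    // In the real config, cpu_count would come from num_cpus::get().
    for cores in [1, 2, 4, 8, 16] {
        println!("{} cores -> {} workers", cores, default_workers(cores));
    }
}
```

Note that the rule is not continuous: 4 cores yields 8 workers, while 5 cores yields 9, which is exactly the kind of edge worth documenting in a comment.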
Lines 38-57: **Refactor repeated environment variable parsing into a helper function.** The parsing of environment variables follows the same pattern multiple times. Consider creating a helper function to reduce code duplication and improve readability. Here's an example of how you might refactor:

```rust
fn parse_env_var<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}
```

Then, update the configuration initialization:

```rust
worker_threads: parse_env_var("WORKER_THREADS", default_workers),
max_connections: parse_env_var("MAX_CONNECTIONS", 10_000),
tcp_keepalive_interval: parse_env_var("TCP_KEEPALIVE_INTERVAL", 30),
tcp_nodelay: parse_env_var("TCP_NODELAY", true),
buffer_size: parse_env_var("BUFFER_SIZE", 8 * 1024),
```
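A self-contained version of the suggested helper can be exercised directly; the behavior worth verifying is the silent fallback on both missing and unparsable values (the `DEMO_*` variable names here are illustrative, not from the PR):

```rust
use std::env;

// Generic fallback-to-default parser for environment variables,
// mirroring the helper suggested above: missing OR unparsable
// values both fall back to the supplied default.
fn parse_env_var<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    env::set_var("DEMO_MAX_CONNECTIONS", "500");
    env::set_var("DEMO_TCP_NODELAY", "not-a-bool");

    let max: usize = parse_env_var("DEMO_MAX_CONNECTIONS", 10_000); // parsed
    let nodelay: bool = parse_env_var("DEMO_TCP_NODELAY", true);   // unparsable -> default
    let buffer: usize = parse_env_var("DEMO_UNSET_BUFFER", 8 * 1024); // unset -> default

    println!("{} {} {}", max, nodelay, buffer); // prints "500 true 8192"
}
```

The one design caveat, as the review notes later, is that this helper swallows parse errors; logging inside the `and_then` closure addresses that.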
Lines 60-62: **Include all configuration fields in the logging output.** Currently, the logging statements do not include `tcp_keepalive_interval` and `tcp_nodelay`. For completeness and easier debugging, consider logging all configuration fields. You might adjust the logging as follows:

```rust
info!("Configuration loaded: port={}, host={}", config.port, config.host);
debug!(
    "Advanced settings: workers={}, max_conn={}, buffer_size={}, tcp_keepalive_interval={}, tcp_nodelay={}",
    config.worker_threads, config.max_connections, config.buffer_size,
    config.tcp_keepalive_interval, config.tcp_nodelay
);
```

**src/proxy/mod.rs (5)**
Lines 86-87: **Simplify HTTP method conversion.** Since `method` is already of type `http::Method`, you can convert it directly to `reqwest::Method` without converting to a byte array. Apply this diff to simplify the conversion:

```diff
-let method = Method::from_bytes(method.as_str().as_bytes())
-    .map_err(|_| AppError::InvalidMethod)?;
+let method = reqwest::Method::from(method);
```
Lines 92-98: **Consider logging headers that fail to parse.** In the header processing loop, headers that fail to parse are silently skipped. Consider logging a warning for headers that are not included due to parsing issues to aid debugging. Apply this diff to add logging:

```diff
 for (name, value) in headers.iter() {
     if let (Ok(name_str), Ok(v)) = (
         name.as_str().parse::<reqwest::header::HeaderName>(),
         reqwest::header::HeaderValue::from_bytes(value.as_bytes()),
     ) {
         reqwest_headers.insert(name_str, v);
+    } else {
+        warn!("Failed to parse header: {}", name);
     }
 }
```
Lines 104-107: **Avoid unnecessary allocation by using `body_bytes` directly.** `body_bytes` is already a `Bytes` object that can be used directly. Using `.to_vec()` creates an unnecessary copy. Pass `body_bytes` directly to `reqwest`. Apply this diff:

```diff
 .headers(reqwest_headers)
-.body(body_bytes.to_vec())
+.body(body_bytes)
 .send()
 .await?;
```
Lines 123-130: **Consider logging failed header insertions.** When processing response headers, headers that fail to parse are silently ignored. Logging these failures can help identify potential issues. Apply this diff to add logging:

```diff
 for (name, value) in response.headers() {
     if let (Ok(header_name), Ok(v)) = (
         http::HeaderName::from_bytes(name.as_ref()),
         HeaderValue::from_bytes(value.as_bytes()),
     ) {
         response_headers.insert(header_name, v);
+    } else {
+        warn!("Failed to parse response header: {:?}", name);
     }
 }
```
Lines 145-148: **Potential unnecessary buffer allocation in streaming.** Allocating a new `BytesMut` buffer for each chunk may not be necessary. Since `bytes` already contains the data, consider returning it directly to reduce overhead. Apply this diff:

```diff
 let stream = response.bytes_stream().map(move |result| {
     match result {
         Ok(bytes) => {
-            let mut buffer = BytesMut::with_capacity(config.buffer_size);
-            buffer.extend_from_slice(&bytes);
-            Ok(buffer.freeze())
+            Ok(bytes)
         }
         Err(e) => {
             error!("Stream error: {}", e);
             Err(std::io::Error::new(std::io::ErrorKind::Other, e))
         }
     }
 });
```
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL

📒 Files selected for processing (6):

- `Cargo.toml` (3 hunks)
- `src/config.rs` (1 hunks)
- `src/handlers.rs` (1 hunks)
- `src/main.rs` (2 hunks)
- `src/proxy/client.rs` (1 hunks)
- `src/proxy/mod.rs` (5 hunks)
🔇 Additional comments (24)

**src/handlers.rs (2)**

Line 4: **LGTM: Import additions are appropriate.** The new imports support the added functionality for client address tracking and improved tracing. (Also applies to lines 10-11.)

Line 21: **LGTM: Enhanced request logging with client information.** The addition of client address tracking and detailed debug logging improves observability without adding significant overhead. (Also applies to lines 29-34.)
**Cargo.toml (5)**

Line 3: **LGTM! Version bump and security improvement.** The version increment and the addition of `.env` to the exclude list are appropriate changes. Excluding `.env` files is a security best practice to prevent accidental exposure of sensitive configuration. (Also applies to line 15.)

Lines 32-33: **LGTM! Thread-related dependencies are well configured.** The combination of:

- tokio with `rt-multi-thread` and `parking_lot`
- `num_cpus` for thread pool sizing

provides a solid foundation for multi-threaded performance improvements. (Also applies to line 47.)

Line 34: **LGTM! Performance-focused feature selections.** The feature selections are well-aligned with performance goals:

- `compression-full` for optimized response sizes
- `serde` for efficient serialization
- `io` features for optimized async operations
Line 37: **Verify reqwest version security status.** The reqwest version 0.12.9 is relatively recent. Let's verify its security status:

```bash
#!/bin/bash
# Check for security advisories for reqwest
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: RUST, package: "reqwest") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'
# Check latest version
cargo search reqwest | grep "^reqwest ="
```
Lines 24-29: **Aggressive optimization settings need documentation.** While these optimization settings align with the performance improvement goals, they come with important trade-offs:

- `panic = "abort"` removes panic unwinding, which means no stack traces on crashes
- `codegen-units = 1` will significantly increase compilation times
- `lto = "fat"` increases memory usage during compilation

Consider:

- Adding these trade-offs to your documentation
- Creating separate profiles for development and production
**src/main.rs (1)**

Lines 31-33: **Ensure sensitive configuration values are not logged.** While logging configuration values is helpful for debugging, ensure that sensitive information (if any) is properly redacted before logging.
**src/proxy/client.rs (2)**

Line 25: **Ensure `rustls` is the desired TLS backend.** By using `.use_rustls_tls()`, you are opting for `rustls` as the TLS backend instead of the default `native-tls` (which often relies on OpenSSL). While `rustls` is pure Rust and has security benefits, it may not support some advanced TLS features or older TLS versions required by some servers. Confirm that all target servers are compatible with `rustls`. If compatibility issues arise, consider using the default TLS backend or making the TLS backend configurable.

```diff
-    .use_rustls_tls()
+    //.use_rustls_tls() // Use default TLS backend
```
Line 19: 🛠️ Refactor suggestion. **Reconsider using `.http2_prior_knowledge()`.** The `.http2_prior_knowledge()` method forces the client to use HTTP/2 without falling back to HTTP/1.1 if HTTP/2 isn't supported by the server. This could lead to connectivity issues with servers that do not support HTTP/2. Verify that all target servers support HTTP/2 exclusively by checking each configured endpoint's protocol support. If not all servers support HTTP/2, consider removing `.http2_prior_knowledge()` to allow protocol negotiation:

```diff
-    .http2_prior_knowledge()
```

This change will enable the client to negotiate the highest HTTP version supported by the server.
**src/proxy/mod.rs (14)**

Line 2: **Appropriate import of `body::{self, Body}`.** Importing both the `body` module and the `Body` type is necessary for handling request and response bodies effectively.

Line 9: **Good addition of `BytesMut` for efficient buffer management.** Including `use bytes::BytesMut;` allows for optimized handling of mutable byte buffers, which is beneficial for streaming responses.

Lines 17-19: **Enhancement: passing `config` to `proxy_request_to_provider`.** By adding `config` as a parameter, the function can now utilize application-wide settings, improving flexibility and configurability.

Line 28: **Improved logging for provider instance creation.** Adding a debug log statement provides valuable insight during debugging and helps trace the request flow.

Line 31: **Added debug logging before executing the `before_request` hook.** This log statement aids in monitoring the lifecycle of the request and can help identify issues in the hook execution.

Lines 54-60: **Passing `config` to `send_provider_request` for configuration-driven request handling.** By passing `config`, the `send_provider_request` function can utilize configuration settings such as buffer sizes and timeout settings, enhancing performance and flexibility.

Line 78: **Extending `send_provider_request` to accept a `config` parameter.** Including `config` in the function parameters allows for dynamic configuration within the request sending process, improving adaptability.

Line 80: **Detailed debug logging before sending the request.** The added debug statement logs the HTTP method and URL, providing essential information for tracing and debugging.

Line 90: **Optimization: pre-allocating the `reqwest_headers` map.** Pre-allocating the header map with a known capacity improves performance by reducing memory reallocations.

Line 109: **Consistent configuration usage by passing `config` to `process_response`.** Passing `config` to `process_response` allows response processing to utilize configuration settings, ensuring consistency.

Lines 112-115: **Updating `process_response` to accept a `config` parameter.** Modifying the function signature to include `config` enables the function to access configuration settings for response handling.

Line 117: **Detailed debug logging for response processing.** Logging the response status aids in monitoring and debugging response handling.

Lines 119-121: **Pre-allocating the `response_headers` map.** Optimizing header map allocation can improve performance by reducing memory allocations.

Line 83: **Incorrect usage of the `body::to_bytes` function.** The `body::to_bytes` function does not accept a second argument. The correct usage is `body::to_bytes(body).await?`. Passing `usize::MAX` is unnecessary and may cause a compilation error. Apply this diff to fix the issue:

```diff
-let body_bytes = body::to_bytes(body, usize::MAX).await?;
+let body_bytes = body::to_bytes(body).await?;
```

Likely invalid or redundant comment.
**src/main.rs** (outdated)

```rust
async fn shutdown_signal() {
    info!("Registering shutdown signal handler");
    tokio::signal::ctrl_c()
        .await
        .expect("Failed to install CTRL+C signal handler");
    info!("Shutdown signal received, starting graceful shutdown");
}
```
🛠️ Refactor suggestion

**Enhance shutdown signal handling.** Consider improving the shutdown handler:

- Handle SIGTERM in addition to CTRL+C
- Add a timeout for graceful shutdown
- Add cleanup operations before shutdown

```diff
 async fn shutdown_signal() {
     info!("Registering shutdown signal handler");
-    tokio::signal::ctrl_c()
-        .await
-        .expect("Failed to install CTRL+C signal handler");
+    let ctrl_c = async {
+        tokio::signal::ctrl_c()
+            .await
+            .expect("Failed to install CTRL+C signal handler")
+    };
+
+    let terminate = async {
+        tokio::signal::unix::signal(tokio::signal::unix::SignalKind::terminate())
+            .expect("Failed to install signal handler")
+            .recv()
+            .await;
+    };
+
+    tokio::select! {
+        _ = ctrl_c => {},
+        _ = terminate => {},
+    }
     info!("Shutdown signal received, starting graceful shutdown");
 }
```
Committable suggestion skipped: line range outside the PR's diff.
src/main.rs
Outdated
// Start server with optimized TCP settings | ||
let addr = std::net::SocketAddr::from(([0, 0, 0, 0], config.port)); | ||
info!("Setting up TCP listener with non-blocking mode"); | ||
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address"); | ||
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking"); | ||
|
||
debug!("Converting to tokio TCP listener"); | ||
let listener = tokio::net::TcpListener::from_std(tcp_listener) | ||
.expect("Failed to create Tokio TCP listener"); | ||
|
||
info!( | ||
address = %addr, | ||
"Starting server" | ||
"AI Gateway listening on {}:{} with {} worker threads", | ||
config.host, config.port, config.worker_threads | ||
); | ||
|
||
match tokio::net::TcpListener::bind(addr).await { | ||
Ok(listener) => { | ||
info!("Server successfully bound to address"); | ||
if let Err(e) = axum::serve(listener, app).await { | ||
error!(error = %e, "Server error occurred"); | ||
std::process::exit(1); | ||
} | ||
} | ||
Err(e) => { | ||
error!(error = %e, "Failed to bind server to address"); | ||
axum::serve(listener, app) | ||
.with_graceful_shutdown(shutdown_signal()) | ||
.await | ||
.unwrap_or_else(|e| { | ||
error!("Server error: {}", e); | ||
std::process::exit(1); | ||
} | ||
} | ||
}); |
🛠️ Refactor suggestion

**Consider additional TCP socket optimizations.** While setting non-blocking mode is good, consider additional TCP socket optimizations:

- Setting SO_REUSEADDR for quick restarts
- Configuring TCP keepalive
- Setting appropriate TCP buffer sizes

```diff
 let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
 tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");
+let socket = tcp_listener.into_raw_fd();
+setsockopt(socket, ReuseAddr, &true)?;
+setsockopt(socket, TcpKeepalive, &Some(Duration::from_secs(60)))?;
```

Committable suggestion skipped: line range outside the PR's diff.
**src/proxy/client.rs**

```rust
pub static CLIENT: Lazy<reqwest::Client> = Lazy::new(|| {
    let config = AppConfig::new();
    create_client(&config)
});
```
🛠️ Refactor suggestion

**Avoid re-initializing configuration within static contexts.** Creating a new `AppConfig` inside the static `CLIENT` may lead to inconsistencies, especially if `AppConfig::new()` reads from environment variables or configuration files that might change between different parts of the application. Consider passing a reference to a globally initialized `AppConfig` when creating the client. For example, if `AppConfig` is initialized in `main`, you can structure your code to pass it accordingly.

Suggestion: modify your `main.rs` to initialize `AppConfig` and pass it to where `CLIENT` is needed. Alternatively, avoid having `CLIENT` as a static variable and create the client instance where needed, passing the appropriate configuration:

```rust
// Remove the static CLIENT
// pub static CLIENT: Lazy<reqwest::Client> = Lazy::new(|| {
//     let config = AppConfig::new();
//     create_client(&config)
// });
```

Then, ensure that the client is created with the correct configuration and passed to functions or modules that require it. This approach ensures consistency and allows for better testing and flexibility.
Force-pushed from `c0e7aa6` to `38808de`, then from `38808de` to `18fd094`.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (10)
**src/proxy/client.rs (1)**

Lines 17-29: **Consider making timeout values configurable.** The function contains several hardcoded timeout values:

- `pool_idle_timeout`: 30s
- `http2_keep_alive_timeout`: 30s
- `timeout`: 30s
- `connect_timeout`: 10s

Consider moving these to the `AppConfig` for better flexibility and configuration management.

```diff
 // In src/config.rs
 pub struct AppConfig {
     pub max_connections: usize,
     pub tcp_keepalive_interval: u64,
     pub tcp_nodelay: bool,
+    pub pool_idle_timeout_secs: u64,
+    pub http2_keep_alive_timeout_secs: u64,
+    pub request_timeout_secs: u64,
+    pub connect_timeout_secs: u64,
 }
```

**Cargo.toml (1)**
Lines 24-29: **Consider the tradeoffs of aggressive optimizations.** While these release profile settings will maximize runtime performance, they come with important tradeoffs:

- `codegen-units = 1` will significantly increase compile times
- `panic = "abort"` removes panic unwinding, which could affect error recovery
- `strip = true` removes debug symbols, making production issues harder to diagnose

Consider whether these tradeoffs align with your operational requirements.

**src/config.rs (3)**
Lines 9-13: **Consider adding documentation for the performance parameters.** The new configuration fields would benefit from documentation explaining their purpose and impact, for example: `worker_threads` (how it affects request handling), `max_connections` (impact on system resources), `tcp_keepalive_interval` (network connection management), `tcp_nodelay` (TCP packet behavior, i.e., Nagle's algorithm), and `buffer_size` (memory usage implications).

```diff
 pub struct AppConfig {
     pub port: u16,
     pub host: String,
+    /// Number of worker threads for handling requests.
+    /// Defaults to CPU cores * 2 for systems with ≤4 cores,
+    /// or CPU cores + 4 for systems with >4 cores.
     pub worker_threads: usize,
+    /// Maximum number of concurrent connections.
+    /// Default: 10,000
     pub max_connections: usize,
+    /// TCP keepalive interval in seconds.
+    /// Default: 30 seconds
     pub tcp_keepalive_interval: u64,
+    /// Enable/disable TCP_NODELAY (Nagle's algorithm).
+    /// Default: true
     pub tcp_nodelay: bool,
+    /// Size of the internal buffer for request/response handling.
+    /// Default: 8KB
     pub buffer_size: usize,
 }
```
Lines 42-45: **Review the default `max_connections` value.** The default of 10,000 connections might be too high for some systems. Consider:

- System resource limitations (file descriptors, memory)
- OS-specific limits
- Available system resources

Consider implementing a dynamic calculation based on system resources or providing a more conservative default. Also, add validation to ensure the value doesn't exceed system limits.
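One way to make the default safer is to clamp the configured value against the process's file-descriptor budget. The sketch below uses a hypothetical `effective_max_connections` helper; in practice the fd limit would come from `getrlimit(RLIMIT_NOFILE)` (e.g., via the `rlimit` crate), but here it is passed in as an assumed parameter, and the 256-descriptor reserve is an illustrative assumption:

```rust
/// Hypothetical sketch: clamp a requested connection limit against the
/// process's file-descriptor limit, reserving some descriptors for
/// files, logging, and upstream sockets.
fn effective_max_connections(requested: usize, fd_limit: usize) -> usize {
    let reserved = 256; // assumption: fds kept free for non-connection use
    let budget = fd_limit.saturating_sub(reserved);
    requested.min(budget).max(1)
}

fn main() {
    // With a common default `ulimit -n` of 1024, a 10,000 default is
    // far above what the process can actually open:
    println!("{}", effective_max_connections(10_000, 1024));   // prints 768
    println!("{}", effective_max_connections(10_000, 65_536)); // prints 10000
}
```

Clamping at startup (with a warning log when the clamp fires) turns a latent "too many open files" runtime failure into a visible configuration message.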
Lines 38-57: **Improve error handling for environment variables.** The current implementation silently falls back to defaults when parsing fails. Consider logging parsing errors to help with debugging configuration issues.

```diff
 worker_threads: env::var("WORKER_THREADS")
     .ok()
-    .and_then(|v| v.parse().ok())
+    .and_then(|v| match v.parse() {
+        Ok(val) => Some(val),
+        Err(e) => {
+            debug!("Failed to parse WORKER_THREADS: {}", e);
+            None
+        }
+    })
     .unwrap_or(default_workers),
```

**.github/workflows/docker-build.yml (3)**
Lines 3-14: **Consider optimizing workflow triggers.** The current configuration triggers builds for all branches. Consider limiting this to specific branches (e.g., main, develop, feature/*) to avoid unnecessary builds and save on GitHub Actions minutes.

```diff
 on:
   push:
     branches:
-      - '**'  # Match all branches
+      - 'main'
+      - 'develop'
+      - 'feature/**'
```
Line 96: **Add newline at end of file.** Add a newline character at the end of the file to comply with POSIX conventions.

```diff
     platforms: linux/amd64,linux/arm64
+
```
🧰 Tools: 🪛 yamllint. [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
Lines 1-96: **Consider implementing workflow reusability.** The workflow is well-structured but could benefit from being split into reusable workflows, especially the Docker build and push logic, which is duplicated between GHCR and Docker Hub. Consider:

- Creating a reusable workflow for the Docker build and push logic
- Parameterizing the registry configuration
- Calling the reusable workflow with different parameters for each registry
🧰 Tools:

- 🪛 actionlint. 28-28: shellcheck reported issue in this script: SC2086: Double quote to prevent globbing and word splitting (shellcheck)
- 🪛 yamllint. [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
**src/main.rs (1)**

Lines 35-38: **Consider making thread settings configurable.** While the worker threads are configurable, the thread stack size is hardcoded to 2 MB. Consider:

- Making the stack size configurable via `AppConfig`
- Adding validation for the `worker_threads` value
- Adding debug logs for the stack-size setting

```diff
-    std::env::set_var("TOKIO_THREAD_STACK_SIZE", (2 * 1024 * 1024).to_string());
+    debug!("Setting thread stack size to {}MB", config.thread_stack_size);
+    std::env::set_var("TOKIO_THREAD_STACK_SIZE",
+        (config.thread_stack_size * 1024 * 1024).to_string());
```

**src/proxy/mod.rs (1)**
Lines 144-147: **Consider reusing the `BytesMut` buffer.** Creating a new buffer for each chunk in the stream could be inefficient. Consider reusing the buffer by moving its creation outside the `map` closure. Apply this diff:

```diff
+let mut buffer = BytesMut::with_capacity(config.buffer_size);
 let stream = response.bytes_stream().map(move |result| {
     match result {
         Ok(bytes) => {
-            let mut buffer = BytesMut::with_capacity(config.buffer_size);
+            buffer.clear();
             buffer.extend_from_slice(&bytes);
             Ok(buffer.freeze())
         }
```
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL

📒 Files selected for processing (7):

- `.github/workflows/docker-build.yml` (1 hunks)
- `Cargo.toml` (3 hunks)
- `src/config.rs` (1 hunks)
- `src/handlers.rs` (1 hunks)
- `src/main.rs` (2 hunks)
- `src/proxy/client.rs` (1 hunks)
- `src/proxy/mod.rs` (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1):

- `src/handlers.rs`

🧰 Additional context used:

- 🪛 actionlint (`.github/workflows/docker-build.yml`). 28-28: shellcheck reported issue in this script: SC2086: Double quote to prevent globbing and word splitting (shellcheck)
- 🪛 yamllint (`.github/workflows/docker-build.yml`). [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
🔇 Additional comments (18)

**src/proxy/client.rs (4)**

Lines 3-5: **LGTM: Clean imports with clear purpose.** The new imports are well-organized and necessary for the implemented functionality.

Lines 7-14: **Well-structured logging implementation.** Good use of both info and debug levels for logging client configuration details.

Lines 34-36: **Previous concern about static `AppConfig` initialization remains valid.** The current implementation still creates a new `AppConfig` inside the static `CLIENT` initialization, which could lead to inconsistencies.

Line 19: **Verify HTTP/2 prior knowledge setting.** The `http2_prior_knowledge()` setting assumes all target endpoints support HTTP/2. This could cause issues if any endpoints only support HTTP/1.1.

✅ Verification successful: **HTTP/2 prior knowledge is appropriate for the API endpoints used.** The codebase exclusively interacts with modern AI service provider APIs (Anthropic, OpenAI, Groq, Fireworks, Together) that are known to support HTTP/2. The client is configured with other HTTP/2 optimizations like adaptive window and keep-alive settings, indicating a deliberate choice for HTTP/2 performance. Using `http2_prior_knowledge()` is safe and beneficial in this context.

🏁 Scripts executed for the analysis:

```bash
#!/bin/bash
# Search for endpoint configurations or URL patterns in the codebase
rg -g '!target/' -g '!*.lock' 'https?://' --type rust
```

Length of output: 506

```bash
#!/bin/bash
# Check if there are any HTTP/2 specific configurations or error handling
rg -g '!target/' -g '!*.lock' 'http2|HTTP/2|h2' --type rust -A 3
# Check proxy client configuration and usage
rg -g '!target/' -g '!*.lock' 'Client::builder|reqwest::Client' --type rust -A 3
```

Length of output: 1394
Cargo.toml (4)
15-15
: LGTM! Good security practice
Adding .env
to the exclude list is a good security practice to prevent accidental publication of environment variables.
37-37
: Verify rustls-tls compatibility with your deployment environments
Switching to rustls-tls
with default-features = false
removes native-tls support. While rustls is more memory-safe and often faster, ensure it's compatible with your deployment environments and proxy configurations.
32-47
: Performance-focused dependency configuration looks good
The dependency updates align well with the PR's performance objectives:
- `parking_lot` for faster synchronization primitives
- `compression-full` for better response compression
- `http2` support for improved connection handling
- Addition of `num_cpus` for dynamic thread pool sizing
3-3
: Verify version number sequence
The version number jumps from 0.1.6 to 0.1.8, skipping 0.1.7. Please confirm if this is intentional.
✅ Verification successful
Version bump from 0.1.6 to 0.1.8 appears to be a direct update
Based on the git history, this appears to be a direct version bump from 0.1.6 to 0.1.8 along with significant dependency updates and feature enhancements. The version increment is justified by the scope of changes:
- Upgraded reqwest from 0.11 to 0.12.9 with additional features
- Enhanced multiple dependencies with new features (tokio, tower-http, tracing, etc.)
- Switched to rustls-tls and added compression features
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check git tags for version history
git tag | grep -E "v?0\.1\.[0-9]+" | sort -V
Length of output: 46
Script:
#!/bin/bash
# Let's check git commit history for version changes in Cargo.toml
git log -p -- Cargo.toml | grep -B 1 "version = "
Length of output: 1759
src/config.rs (2)
2-4
: LGTM: Appropriate imports for the new functionality
The new imports support CPU core detection and enhanced logging capabilities.
60-62
: LGTM: Comprehensive logging implementation
The logging implementation provides good visibility into the configuration loading process, including both high-level info and detailed debug information.
.github/workflows/docker-build.yml (3)
18-21
: LGTM! Well-scoped permissions
The permissions are correctly minimized to only what's needed: read for contents and write for packages.
36-46
: LGTM! Efficient build configuration
The configuration is well-structured with:
- Proper cache usage for faster builds
- Multi-platform support (amd64, arm64)
- Build-only verification for non-main branches
47-77
: LGTM! Secure GHCR configuration
The GHCR publishing setup follows security best practices:
- Uses GITHUB_TOKEN for authentication
- Implements proper tagging strategy
- Maintains build cache
src/main.rs (4)
7-7
: LGTM: Improved logging setup
Good improvements to the logging configuration:
- Using structured logging with compact format
- Proper fallback for RUST_LOG environment variable
- Clear initialization message
Also applies to: 21-27
48-54
: LGTM: Enhanced connection handling
Good improvement using `into_make_service_with_connect_info`, which enables access to client connection information. This will be useful for:
- Request logging with client IP
- Rate limiting per client
- Connection tracking
80-99
: LGTM: Well-implemented signal handling
Excellent implementation of the shutdown signal handler:
- Handles both CTRL+C and SIGTERM signals
- Uses tokio::select! for efficient concurrent signal handling
- Clear logging of shutdown stages
31-34
: Review logging of configuration values
While debug logging configuration is helpful, ensure no sensitive values (if any exist in AppConfig) are being logged. Consider masking sensitive values or moving this to trace level logging.
✅ Verification successful
Debug logging of configuration is acceptable
After reviewing the AppConfig struct definition, I can confirm that it only contains non-sensitive configuration values like port, host, worker threads, and network-related settings. The current debug logging of port and host is appropriate and doesn't pose any security concerns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if AppConfig contains any sensitive fields
ast-grep --pattern 'struct AppConfig {
$$$
password: $_,
$$$
}' src/config.rs
ast-grep --pattern 'struct AppConfig {
$$$
key: $_,
$$$
}' src/config.rs
ast-grep --pattern 'struct AppConfig {
$$$
secret: $_,
$$$
}' src/config.rs
Length of output: 250
Script:
#!/bin/bash
# Let's first check if the config.rs file exists and see its content
fd "config.rs" --type f --exec cat {}
# As a backup, let's also check for AppConfig struct definition in any Rust file
rg "struct AppConfig" -A 10
Length of output: 2619
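If a sensitive field were ever added to `AppConfig`, one way to keep it out of debug logs is a manual `Debug` impl that redacts it before formatting. A minimal sketch, assuming a hypothetical `api_key` field (not present in the actual struct):

```rust
use std::fmt;

// Illustrative config struct; field names are assumptions, not from the PR.
struct AppConfig {
    host: String,
    port: u16,
    api_key: Option<String>, // a sensitive field, if one were ever added
}

// Manual Debug impl that masks the sensitive value before it reaches logs.
impl fmt::Debug for AppConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("AppConfig")
            .field("host", &self.host)
            .field("port", &self.port)
            .field("api_key", &self.api_key.as_deref().map(|_| "***redacted***"))
            .finish()
    }
}

fn main() {
    let config = AppConfig {
        host: "127.0.0.1".into(),
        port: 3000,
        api_key: Some("super-secret".into()),
    };
    let rendered = format!("{:?}", config);
    // The secret never appears in the formatted output.
    assert!(!rendered.contains("super-secret"));
    assert!(rendered.contains("redacted"));
    println!("{rendered}");
}
```

With this in place, `debug!("{:?}", config)` stays safe even if the struct later grows sensitive fields.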
src/proxy/mod.rs (1)
Line range hint 17-187
: Overall improvements look good!
The changes show good attention to performance optimization through:
- Pre-allocation of header maps
- Efficient buffer handling for streaming responses
- Improved configuration management
- Better logging for observability
The previous header handling issues have been properly addressed. Once the identified issues are fixed, this will be a solid improvement to the codebase.
.build()
    .expect("Failed to create HTTP client")
}
🛠️ Refactor suggestion
Improve error handling for client creation
Using `expect()` could crash the application if client creation fails. Consider propagating the error to the caller for better error handling.
-pub fn create_client(config: &AppConfig) -> reqwest::Client {
+pub fn create_client(config: &AppConfig) -> Result<reqwest::Client, reqwest::Error> {
info!("Creating HTTP client with optimized settings");
// ... rest of the configuration ...
- .build()
- .expect("Failed to create HTTP client")
+ .build()
}
Committable suggestion skipped: line range outside the PR's diff.
// Optimize thread count based on CPU cores
let cpu_count = num_cpus::get();
debug!("Detected {} CPU cores", cpu_count);

let default_workers = if cpu_count <= 4 {
    cpu_count * 2
} else {
    cpu_count + 4
};
🛠️ Refactor suggestion
Review worker thread calculation strategy
The current worker thread calculation (`cpu_count * 2` for ≤4 cores, `cpu_count + 4` for >4 cores) might need adjustment:
- For single-core systems, having 2 worker threads might be excessive
- For many-core systems (32+), the current formula might create too many threads
Consider implementing a more bounded approach:
- let default_workers = if cpu_count <= 4 {
- cpu_count * 2
- } else {
- cpu_count + 4
- };
+ let default_workers = {
+ let calculated = if cpu_count <= 4 {
+ cpu_count * 2
+ } else {
+ cpu_count + 4
+ };
+ // Ensure reasonable bounds
+ calculated.clamp(1, 32)
+ };
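The bounded heuristic above can be sketched as a standalone function. The 1–32 bounds are the suggestion's illustrative defaults, not values taken from the PR:

```rust
// Sketch of the bounded worker-thread heuristic suggested in the review.
// The clamp bounds (1, 32) are illustrative assumptions.
fn default_workers(cpu_count: usize) -> usize {
    let calculated = if cpu_count <= 4 {
        cpu_count * 2
    } else {
        cpu_count + 4
    };
    // Ensure reasonable bounds
    calculated.clamp(1, 32)
}

fn main() {
    assert_eq!(default_workers(1), 2);   // small machines still get some parallelism
    assert_eq!(default_workers(4), 8);
    assert_eq!(default_workers(16), 20);
    assert_eq!(default_workers(64), 32); // many-core systems are capped
    println!("ok");
}
```

Factoring the heuristic out also makes it trivial to unit test, which the inline expression does not allow.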
buffer_size: env::var("BUFFER_SIZE")
    .ok()
    .and_then(|v| v.parse().ok())
    .unwrap_or(8 * 1024), // 8KB default
Validate buffer_size constraints
The buffer size should be validated to ensure it's within reasonable bounds:
- Minimum size to handle typical requests
- Maximum size to prevent memory issues
buffer_size: env::var("BUFFER_SIZE")
.ok()
- .and_then(|v| v.parse().ok())
- .unwrap_or(8 * 1024), // 8KB default
+ .and_then(|v| v.parse().ok())
+ .map(|size: usize| size.clamp(4096, 32 * 1024)) // Clamp between 4KB and 32KB
+ .unwrap_or(8 * 1024), // 8KB default
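The suggested clamping can be sketched as a small pure function. The 4 KB–32 KB bounds mirror the suggestion above, and the `buffer_size_from` helper name is an assumption for illustration:

```rust
// Sketch of parsing BUFFER_SIZE with bounds, per the review suggestion.
// Bounds (4 KB floor, 32 KB ceiling) and the helper name are assumptions.
fn buffer_size_from(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.parse::<usize>().ok())
        .map(|size| size.clamp(4096, 32 * 1024))
        .unwrap_or(8 * 1024) // 8KB default
}

fn main() {
    assert_eq!(buffer_size_from(None), 8192);           // unset -> default
    assert_eq!(buffer_size_from(Some("oops")), 8192);   // unparsable -> default
    assert_eq!(buffer_size_from(Some("1024")), 4096);   // too small -> floor
    assert_eq!(buffer_size_from(Some("65536")), 32768); // too large -> ceiling
    assert_eq!(buffer_size_from(Some("16384")), 16384); // in range -> kept
    println!("ok");
}
```

In the actual config loader this would be called with `env::var("BUFFER_SIZE").ok().as_deref()`.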
tags: |
  magicapi1/magicapi-ai-gateway:latest
  magicapi1/magicapi-ai-gateway:${{ env.CARGO_VERSION }}
cache-from: type=gha
🛠️ Refactor suggestion
Avoid hardcoding Docker Hub image name
The Docker Hub image name is hardcoded, which could cause issues if the repository name changes. Consider using environment variables or GitHub context.
tags: |
- magicapi1/magicapi-ai-gateway:latest
- magicapi1/magicapi-ai-gateway:${{ env.CARGO_VERSION }}
+ ${{ secrets.DOCKERHUB_USERNAME }}/${{ github.event.repository.name }}:latest
+ ${{ secrets.DOCKERHUB_USERNAME }}/${{ github.event.repository.name }}:${{ env.CARGO_VERSION }}
- name: Extract version from Cargo.toml
  id: version
  run: |
    VERSION=$(awk -F '"' '/^version = / {print $2}' Cargo.toml)
    echo "CARGO_VERSION=${VERSION}" >> $GITHUB_ENV
    echo "Version found: ${VERSION}"
Fix shell script quoting
The version extraction script needs proper quoting to handle potential spaces in paths and prevent word splitting.
run: |
- VERSION=$(awk -F '"' '/^version = / {print $2}' Cargo.toml)
- echo "CARGO_VERSION=${VERSION}" >> $GITHUB_ENV
+ VERSION=$(awk -F '"' '/^version = / {print $2}' "Cargo.toml")
+ echo "CARGO_VERSION=${VERSION}" >> "$GITHUB_ENV"
echo "Version found: ${VERSION}"
🧰 Tools
🪛 actionlint
28-28: shellcheck reported issue in this script: SC2086:info:2:36: Double quote to prevent globbing and word splitting
(shellcheck)
info!("Setting up TCP listener with non-blocking mode");
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");

debug!("Converting to tokio TCP listener");
let listener = tokio::net::TcpListener::from_std(tcp_listener)
    .expect("Failed to create Tokio TCP listener");

info!(
    "AI Gateway listening on {}:{} with {} worker threads",
    config.host, config.port, config.worker_threads
);

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await
    .unwrap_or_else(|e| {
        error!("Server error: {}", e);
        std::process::exit(1);
    });
🛠️ Refactor suggestion
Consider enhancing error handling
While the error handling has improved with `unwrap_or_else`, consider implementing a more graceful shutdown:
- Log all error details including context
- Attempt to clean up resources
- Use a proper error code based on the failure type
- .unwrap_or_else(|e| {
- error!("Server error: {}", e);
- std::process::exit(1);
- });
+ .unwrap_or_else(|e| {
+ error!("Server error: {:#}", e);
+ error!("Shutting down due to critical error");
+ // Attempt cleanup
+ std::process::exit(match e.kind() {
+ std::io::ErrorKind::AddrInUse => 2,
+ std::io::ErrorKind::PermissionDenied => 3,
+ _ => 1,
+ });
+ });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
info!("Setting up TCP listener with non-blocking mode");
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");

debug!("Converting to tokio TCP listener");
let listener = tokio::net::TcpListener::from_std(tcp_listener)
    .expect("Failed to create Tokio TCP listener");

info!(
    "AI Gateway listening on {}:{} with {} worker threads",
    config.host, config.port, config.worker_threads
);

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await
    .unwrap_or_else(|e| {
        error!("Server error: {:#}", e);
        error!("Shutting down due to critical error");
        // Attempt cleanup
        std::process::exit(match e.kind() {
            std::io::ErrorKind::AddrInUse => 2,
            std::io::ErrorKind::PermissionDenied => 3,
            _ => 1,
        });
    });
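The suggested exit-code mapping can also be factored into a small helper, which keeps the serve call readable and makes the mapping testable. The helper name `exit_code_for` and the specific codes (2, 3, 1) are illustrative, matching the suggestion above:

```rust
use std::io;

// Sketch of mapping server failures to distinct exit codes, per the
// review suggestion. The helper name and code values are assumptions.
fn exit_code_for(err: &io::Error) -> i32 {
    match err.kind() {
        io::ErrorKind::AddrInUse => 2,
        io::ErrorKind::PermissionDenied => 3,
        _ => 1,
    }
}

fn main() {
    let in_use = io::Error::new(io::ErrorKind::AddrInUse, "address in use");
    let denied = io::Error::new(io::ErrorKind::PermissionDenied, "permission denied");
    let other = io::Error::new(io::ErrorKind::Other, "something else");
    assert_eq!(exit_code_for(&in_use), 2);
    assert_eq!(exit_code_for(&denied), 3);
    assert_eq!(exit_code_for(&other), 1);
    println!("ok");
}
```

Distinct exit codes let process supervisors (systemd, Kubernetes) react differently to a port conflict versus a permissions problem.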
Ok(response_builder
    .body(Body::from_stream(stream))
    .unwrap())
Replace unwrap() with proper error handling
Using `unwrap()` could cause panics if the response builder fails. Consider proper error handling using `map_err()`.
Apply this diff for streaming response:
- Ok(response_builder
- .body(Body::from_stream(stream))
- .unwrap())
+ response_builder
+ .body(Body::from_stream(stream))
+ .map_err(|e| AppError::ResponseBuildError(e.to_string()))
And for regular response:
- Ok(response_builder
- .body(Body::from(body))
- .unwrap())
+ response_builder
+ .body(Body::from(body))
+ .map_err(|e| AppError::ResponseBuildError(e.to_string()))
Also applies to: 185-187
) -> Result<Response<Body>, AppError> {
    debug!("Preparing to send request: {} {}", method, url);

    let body_bytes = body::to_bytes(body, usize::MAX).await?;
Consider adding a configurable body size limit
Using `usize::MAX` as the body size limit could lead to memory exhaustion with large payloads. Consider using a configurable limit from the `AppConfig`.
Apply this diff:
- let body_bytes = body::to_bytes(body, usize::MAX).await?;
+ let body_bytes = body::to_bytes(body, config.buffer_size).await?;
Committable suggestion skipped: line range outside the PR's diff.
Use Thread and various performance improvements
Summary by CodeRabbit
Release Notes
New Features
Improvements
Bug Fixes
Chores
Added a `.dockerignore` file to streamline the Docker build process.