Use Thread and various performance improvements #5
Conversation
> **Caution**: Review failed. The pull request is closed.

**Walkthrough**: The pull request introduces multiple changes across several files, primarily focused on performance improvements.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (18)
**src/handlers.rs (2)**

Lines 14-15: **Consider enhancing the health check log message.** While the debug logging is a good addition, consider including the version number in the log message for better observability.

```diff
-    debug!("Health check endpoint called");
+    debug!("Health check endpoint called, version: {}", env!("CARGO_PKG_VERSION"));
```
Lines 36-54: **Consider span field optimization and async performance.** The tracing implementation is good, but consider these optimizations:

- Cache the string conversions for method and path before creating the span
- Consider using `instrument(span)` at the function level instead of wrapping the entire implementation

This could reduce allocations and improve performance for high-throughput scenarios.

```diff
 pub async fn proxy_request(
     State(config): State<Arc<AppConfig>>,
     headers: HeaderMap,
     ConnectInfo(addr): ConnectInfo<SocketAddr>,
     request: Request<Body>,
-) -> impl IntoResponse {
+) -> impl IntoResponse {
+    let method = request.method().to_string();
+    let path = request.uri().path().to_string();
+    let span = tracing::info_span!(
+        "proxy_request",
+        provider = provider,
+        method = %method,
+        path = %path,
+        client = %addr
+    );
+    proxy_request_inner(config, headers, addr, request).instrument(span).await
+}
+
+async fn proxy_request_inner(
+    State(config): State<Arc<AppConfig>>,
+    headers: HeaderMap,
+    addr: SocketAddr,
+    request: Request<Body>,
+) -> impl IntoResponse {
+    // ... rest of the implementation
```

**src/main.rs (3)**
Lines 21-27: **Consider making the logging format configurable.** While the compact format is good for production, it might hide useful debugging information during development. Consider making the logging format configurable through the `AppConfig`.

```diff
-        .with(tracing_subscriber::fmt::layer().compact())
+        .with(
+            tracing_subscriber::fmt::layer()
+                .with_target(config.debug_mode)
+                .with_file(config.debug_mode)
+                .with_line_number(config.debug_mode)
+                .compact()
+        )
```
Lines 35-38: **Consider making thread stack size configurable.** While optimizing the thread stack size is good, the hard-coded value of 2 MB might not be optimal for all workloads. Consider making it configurable through `AppConfig` with 2 MB as the default.

```diff
-    std::env::set_var("TOKIO_THREAD_STACK_SIZE", (2 * 1024 * 1024).to_string());
+    std::env::set_var("TOKIO_THREAD_STACK_SIZE", config.thread_stack_size.to_string());
```
Lines 48-54: **Consider adding rate limiting and request size limits.** While the router setup is good, consider adding the following middleware for better protection:

- Rate limiting to prevent DoS
- Request size limits to prevent memory exhaustion
- Timeout middleware for long-running requests

```diff
 let app = Router::new()
     .route("/health", get(handlers::health_check))
     .route("/v1/*path", any(handlers::proxy_request))
     .with_state(config.clone())
+    .layer(tower_http::limit::RequestBodyLimitLayer::new(config.max_request_size))
+    .layer(tower_http::timeout::TimeoutLayer::new(Duration::from_secs(30)))
     .layer(cors)
```

**src/proxy/client.rs (3)**
Lines 3-5: **Organize import statements for better readability.** Consider grouping related imports together to enhance readability. You can combine the `tracing` imports into a single line. Apply this diff to reorganize the imports:

```diff
 use once_cell::sync::Lazy;
 use std::time::Duration;
+use tracing::{debug, info};
-use crate::config::AppConfig;
-use tracing::info;
-use tracing::debug;
+use crate::config::AppConfig;
```
Lines 8-14: **Adjust logging levels to prevent sensitive data exposure.** While logging is essential for monitoring, be cautious about the verbosity and level of detail, especially in production environments. Configuration details might contain sensitive information. Consider changing the `info`-level log to `debug` and ensuring that no sensitive data is logged. Apply this diff to adjust the logging levels:

```diff
-    info!("Creating HTTP client with optimized settings");
-    debug!(
+    debug!("Creating HTTP client with optimized settings");
+    info!(
         "Client config: max_connections={}, keepalive={}s, nodelay={}",
         config.max_connections, config.tcp_keepalive_interval, config.tcp_nodelay
     );
```
Lines 17-29: **Make timeout values configurable.** The timeout values and durations are currently hardcoded. For greater flexibility and adaptability to different environments, consider making these values configurable through `AppConfig`.

Suggestion: add new fields to `AppConfig` for the following configurations:

- `pool_idle_timeout`
- `http2_keep_alive_timeout`
- `general_timeout`
- `connect_timeout`

Then, update the client builder as follows:

```rust
.pool_idle_timeout(Duration::from_secs(config.pool_idle_timeout))
.http2_keep_alive_timeout(Duration::from_secs(config.http2_keep_alive_timeout))
.timeout(Duration::from_secs(config.general_timeout))
.connect_timeout(Duration::from_secs(config.connect_timeout))
```

Ensure that `AppConfig` and its initialization are updated to include these new fields.

**src/config.rs (5)**
Lines 9-13: **Consider adding documentation comments for new config fields.** For better code readability and maintainability, consider adding documentation comments to the new fields in `AppConfig` to explain their purpose and units (e.g., whether `tcp_keepalive_interval` is in seconds or milliseconds).
Lines 18-19: **Load environment variables earlier in the application's startup sequence.** Calling `dotenv::dotenv().ok();` within `AppConfig::new()` may be too late if environment variables are needed elsewhere during application startup. Consider loading the environment variables at the very beginning of the application, such as in the `main` function.
Lines 25-29: **Clarify the logic for calculating default worker threads.** The calculation for `default_workers` may benefit from clarification or adjustment:

- For CPU cores ≤ 4, `default_workers = cpu_count * 2`
- For CPU cores > 4, `default_workers = cpu_count + 4`

Consider whether this scaling is appropriate for your application's performance requirements. You might also want to explain the rationale behind these specific calculations, possibly through code comments, to make it clear for future maintainers.
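The scaling rule above can be sketched as a pure function. This is a hypothetical `default_workers` helper for illustration (not code from the PR), which makes the break-even behavior around 4 cores easy to unit-test:

```rust
/// Hypothetical sketch of the default worker-thread rule described above:
/// machines with few cores get cpu_count * 2 (I/O-bound request handling
/// tolerates oversubscription), larger machines get cpu_count + 4 to cap
/// per-thread overhead.
fn default_workers(cpu_count: usize) -> usize {
    if cpu_count <= 4 {
        cpu_count * 2
    } else {
        cpu_count + 4
    }
}

fn main() {
    // In the real config, cpu_count would come from num_cpus::get().
    for cores in [1, 2, 4, 8, 16] {
        println!("{} cores -> {} workers", cores, default_workers(cores));
    }
}
```

Note that the rule is not continuous: 4 cores yields 8 workers, while 5 cores yields 9, which is exactly the kind of edge worth documenting in a comment.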
Lines 38-57: **Refactor repeated environment variable parsing into a helper function.** The parsing of environment variables follows the same pattern multiple times. Consider creating a helper function to reduce code duplication and improve readability. Here's an example of how you might refactor:

```rust
fn parse_env_var<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}
```

Then, update the configuration initialization:

```rust
worker_threads: parse_env_var("WORKER_THREADS", default_workers),
max_connections: parse_env_var("MAX_CONNECTIONS", 10_000),
tcp_keepalive_interval: parse_env_var("TCP_KEEPALIVE_INTERVAL", 30),
tcp_nodelay: parse_env_var("TCP_NODELAY", true),
buffer_size: parse_env_var("BUFFER_SIZE", 8 * 1024),
```
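A self-contained version of the suggested helper can be exercised directly; the behavior worth verifying is the silent fallback on both missing and unparsable values (the `DEMO_*` variable names here are illustrative, not from the PR):

```rust
use std::env;

// Generic fallback-to-default parser for environment variables,
// mirroring the helper suggested above: missing OR unparsable
// values both fall back to the supplied default.
fn parse_env_var<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    env::set_var("DEMO_MAX_CONNECTIONS", "500");
    env::set_var("DEMO_TCP_NODELAY", "not-a-bool");

    let max: usize = parse_env_var("DEMO_MAX_CONNECTIONS", 10_000); // parsed
    let nodelay: bool = parse_env_var("DEMO_TCP_NODELAY", true);   // unparsable -> default
    let buffer: usize = parse_env_var("DEMO_UNSET_BUFFER", 8 * 1024); // unset -> default

    println!("{} {} {}", max, nodelay, buffer); // prints "500 true 8192"
}
```

The one design caveat, as the review notes later, is that this helper swallows parse errors; logging inside the `and_then` closure addresses that.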
Lines 60-62: **Include all configuration fields in the logging output.** Currently, the logging statements do not include `tcp_keepalive_interval` and `tcp_nodelay`. For completeness and easier debugging, consider logging all configuration fields. You might adjust the logging as follows:

```rust
info!("Configuration loaded: port={}, host={}", config.port, config.host);
debug!(
    "Advanced settings: workers={}, max_conn={}, buffer_size={}, tcp_keepalive_interval={}, tcp_nodelay={}",
    config.worker_threads, config.max_connections, config.buffer_size,
    config.tcp_keepalive_interval, config.tcp_nodelay
);
```

**src/proxy/mod.rs (5)**
Lines 86-87: **Simplify HTTP method conversion.** Since `method` is already of type `http::Method`, you can convert it directly to `reqwest::Method` without converting to a byte array. Apply this diff to simplify the conversion:

```diff
-let method = Method::from_bytes(method.as_str().as_bytes())
-    .map_err(|_| AppError::InvalidMethod)?;
+let method = reqwest::Method::from(method);
```
Lines 92-98: **Consider logging headers that fail to parse.** In the header processing loop, headers that fail to parse are silently skipped. Consider logging a warning for headers that are not included due to parsing issues to aid debugging. Apply this diff to add logging:

```diff
 for (name, value) in headers.iter() {
     if let (Ok(name_str), Ok(v)) = (
         name.as_str().parse::<reqwest::header::HeaderName>(),
         reqwest::header::HeaderValue::from_bytes(value.as_bytes()),
     ) {
         reqwest_headers.insert(name_str, v);
+    } else {
+        warn!("Failed to parse header: {}", name);
     }
 }
```
Lines 104-107: **Avoid unnecessary allocation by using `body_bytes` directly.** `body_bytes` is already a `Bytes` object that can be used directly. Using `.to_vec()` creates an unnecessary copy. Pass `body_bytes` directly to `reqwest`. Apply this diff:

```diff
 .headers(reqwest_headers)
-.body(body_bytes.to_vec())
+.body(body_bytes)
 .send()
 .await?;
```
Lines 123-130: **Consider logging failed header insertions.** When processing response headers, headers that fail to parse are silently ignored. Logging these failures can help identify potential issues. Apply this diff to add logging:

```diff
 for (name, value) in response.headers() {
     if let (Ok(header_name), Ok(v)) = (
         http::HeaderName::from_bytes(name.as_ref()),
         HeaderValue::from_bytes(value.as_bytes()),
     ) {
         response_headers.insert(header_name, v);
+    } else {
+        warn!("Failed to parse response header: {:?}", name);
     }
 }
```
Lines 145-148: **Potential unnecessary buffer allocation in streaming.** Allocating a new `BytesMut` buffer for each chunk may not be necessary. Since `bytes` already contains the data, consider returning it directly to reduce overhead. Apply this diff:

```diff
 let stream = response.bytes_stream().map(move |result| {
     match result {
         Ok(bytes) => {
-            let mut buffer = BytesMut::with_capacity(config.buffer_size);
-            buffer.extend_from_slice(&bytes);
-            Ok(buffer.freeze())
+            Ok(bytes)
         }
         Err(e) => {
             error!("Stream error: {}", e);
             Err(std::io::Error::new(std::io::ErrorKind::Other, e))
         }
     }
 });
```
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL

📒 Files selected for processing (6):

- `Cargo.toml` (3 hunks)
- `src/config.rs` (1 hunks)
- `src/handlers.rs` (1 hunks)
- `src/main.rs` (2 hunks)
- `src/proxy/client.rs` (1 hunks)
- `src/proxy/mod.rs` (5 hunks)
🔇 Additional comments (24)

**src/handlers.rs (2)**

Line 4: **LGTM: Import additions are appropriate.** The new imports support the added functionality for client address tracking and improved tracing. (Also applies to lines 10-11.)

Line 21: **LGTM: Enhanced request logging with client information.** The addition of client address tracking and detailed debug logging improves observability without adding significant overhead. (Also applies to lines 29-34.)
**Cargo.toml (5)**

Line 3: **LGTM! Version bump and security improvement.** The version increment and the addition of `.env` to the exclude list are appropriate changes. Excluding `.env` files is a security best practice to prevent accidental exposure of sensitive configuration. (Also applies to line 15.)

Lines 32-33: **LGTM! Thread-related dependencies are well configured.** The combination of:

- tokio with `rt-multi-thread` and `parking_lot`
- `num_cpus` for thread pool sizing

provides a solid foundation for multi-threaded performance improvements. (Also applies to line 47.)

Line 34: **LGTM! Performance-focused feature selections.** The feature selections are well-aligned with performance goals:

- `compression-full` for optimized response sizes
- `serde` for efficient serialization
- `io` features for optimized async operations
Line 37: **Verify reqwest version security status.** The reqwest version 0.12.9 is relatively recent. Let's verify its security status:

```bash
#!/bin/bash
# Check for security advisories for reqwest
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: RUST, package: "reqwest") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'
# Check latest version
cargo search reqwest | grep "^reqwest ="
```
Lines 24-29: **Aggressive optimization settings need documentation.** While these optimization settings align with the performance improvement goals, they come with important trade-offs:

- `panic = "abort"` removes panic unwinding, which means no stack traces on crashes
- `codegen-units = 1` will significantly increase compilation times
- `lto = "fat"` increases memory usage during compilation

Consider:

- Adding these trade-offs to your documentation
- Creating separate profiles for development and production
**src/main.rs (1)**

Lines 31-33: **Ensure sensitive configuration values are not logged.** While logging configuration values is helpful for debugging, ensure that sensitive information (if any) is properly redacted before logging.
**src/proxy/client.rs (2)**

Line 25: **Ensure `rustls` is the desired TLS backend.** By using `.use_rustls_tls()`, you are opting for `rustls` as the TLS backend instead of the default `native-tls` (which often relies on OpenSSL). While `rustls` is pure Rust and has security benefits, it may not support some advanced TLS features or older TLS versions required by some servers. Confirm that all target servers are compatible with `rustls`. If compatibility issues arise, consider using the default TLS backend or making the TLS backend configurable.

```diff
-    .use_rustls_tls()
+    //.use_rustls_tls() // Use default TLS backend
```
Line 19: 🛠️ Refactor suggestion. **Reconsider using `.http2_prior_knowledge()`.** The `.http2_prior_knowledge()` method forces the client to use HTTP/2 without falling back to HTTP/1.1 if HTTP/2 isn't supported by the server. This could lead to connectivity issues with servers that do not support HTTP/2. Verify that all target servers support HTTP/2 exclusively by checking each configured endpoint's protocol support. If not all servers support HTTP/2, consider removing `.http2_prior_knowledge()` to allow protocol negotiation:

```diff
-    .http2_prior_knowledge()
```

This change will enable the client to negotiate the highest HTTP version supported by the server.
**src/proxy/mod.rs (14)**

Line 2: **Appropriate import of `body::{self, Body}`.** Importing both the `body` module and the `Body` type is necessary for handling request and response bodies effectively.

Line 9: **Good addition of `BytesMut` for efficient buffer management.** Including `use bytes::BytesMut;` allows for optimized handling of mutable byte buffers, which is beneficial for streaming responses.

Lines 17-19: **Enhancement: passing `config` to `proxy_request_to_provider`.** By adding `config` as a parameter, the function can now utilize application-wide settings, improving flexibility and configurability.

Line 28: **Improved logging for provider instance creation.** Adding a debug log statement provides valuable insight during debugging and helps trace the request flow.

Line 31: **Added debug logging before executing the `before_request` hook.** This log statement aids in monitoring the lifecycle of the request and can help identify issues in the hook execution.

Lines 54-60: **Passing `config` to `send_provider_request` for configuration-driven request handling.** By passing `config`, the `send_provider_request` function can utilize configuration settings such as buffer sizes and timeout settings, enhancing performance and flexibility.

Line 78: **Extending `send_provider_request` to accept a `config` parameter.** Including `config` in the function parameters allows for dynamic configuration within the request sending process, improving adaptability.

Line 80: **Detailed debug logging before sending the request.** The added debug statement logs the HTTP method and URL, providing essential information for tracing and debugging.

Line 90: **Optimization: pre-allocating the `reqwest_headers` map.** Pre-allocating the header map with a known capacity improves performance by reducing memory reallocations.

Line 109: **Consistent configuration usage by passing `config` to `process_response`.** Passing `config` to `process_response` allows response processing to utilize configuration settings, ensuring consistency.

Lines 112-115: **Updating `process_response` to accept a `config` parameter.** Modifying the function signature to include `config` enables the function to access configuration settings for response handling.

Line 117: **Detailed debug logging for response processing.** Logging the response status aids in monitoring and debugging response handling.

Lines 119-121: **Pre-allocating the `response_headers` map.** Optimizing header map allocation can improve performance by reducing memory allocations.

Line 83: **Incorrect usage of the `body::to_bytes` function.** The `body::to_bytes` function does not accept a second argument. The correct usage is `body::to_bytes(body).await?`. Passing `usize::MAX` is unnecessary and may cause a compilation error. Apply this diff to fix the issue:

```diff
-let body_bytes = body::to_bytes(body, usize::MAX).await?;
+let body_bytes = body::to_bytes(body).await?;
```

Likely invalid or redundant comment.
**src/main.rs** (outdated)

```rust
async fn shutdown_signal() {
    info!("Registering shutdown signal handler");
    tokio::signal::ctrl_c()
        .await
        .expect("Failed to install CTRL+C signal handler");
    info!("Shutdown signal received, starting graceful shutdown");
}
```
🛠️ Refactor suggestion

**Enhance shutdown signal handling.** Consider improving the shutdown handler:

- Handle SIGTERM in addition to CTRL+C
- Add a timeout for graceful shutdown
- Add cleanup operations before shutdown

```diff
 async fn shutdown_signal() {
     info!("Registering shutdown signal handler");
-    tokio::signal::ctrl_c()
-        .await
-        .expect("Failed to install CTRL+C signal handler");
+    let ctrl_c = async {
+        tokio::signal::ctrl_c()
+            .await
+            .expect("Failed to install CTRL+C signal handler")
+    };
+
+    let terminate = async {
+        tokio::signal::unix::signal(tokio::signal::unix::SignalKind::terminate())
+            .expect("Failed to install signal handler")
+            .recv()
+            .await;
+    };
+
+    tokio::select! {
+        _ = ctrl_c => {},
+        _ = terminate => {},
+    }
     info!("Shutdown signal received, starting graceful shutdown");
 }
```
Committable suggestion skipped: line range outside the PR's diff.
src/main.rs
Outdated
// Start server with optimized TCP settings | ||
let addr = std::net::SocketAddr::from(([0, 0, 0, 0], config.port)); | ||
info!("Setting up TCP listener with non-blocking mode"); | ||
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address"); | ||
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking"); | ||
|
||
debug!("Converting to tokio TCP listener"); | ||
let listener = tokio::net::TcpListener::from_std(tcp_listener) | ||
.expect("Failed to create Tokio TCP listener"); | ||
|
||
info!( | ||
address = %addr, | ||
"Starting server" | ||
"AI Gateway listening on {}:{} with {} worker threads", | ||
config.host, config.port, config.worker_threads | ||
); | ||
|
||
match tokio::net::TcpListener::bind(addr).await { | ||
Ok(listener) => { | ||
info!("Server successfully bound to address"); | ||
if let Err(e) = axum::serve(listener, app).await { | ||
error!(error = %e, "Server error occurred"); | ||
std::process::exit(1); | ||
} | ||
} | ||
Err(e) => { | ||
error!(error = %e, "Failed to bind server to address"); | ||
axum::serve(listener, app) | ||
.with_graceful_shutdown(shutdown_signal()) | ||
.await | ||
.unwrap_or_else(|e| { | ||
error!("Server error: {}", e); | ||
std::process::exit(1); | ||
} | ||
} | ||
}); |
🛠️ Refactor suggestion

**Consider additional TCP socket optimizations.** While setting non-blocking mode is good, consider additional TCP socket optimizations:

- Setting SO_REUSEADDR for quick restarts
- Configuring TCP keepalive
- Setting appropriate TCP buffer sizes

```diff
 let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
 tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");
+let socket = tcp_listener.into_raw_fd();
+setsockopt(socket, ReuseAddr, &true)?;
+setsockopt(socket, TcpKeepalive, &Some(Duration::from_secs(60)))?;
```

Committable suggestion skipped: line range outside the PR's diff.
**src/proxy/client.rs**

```rust
pub static CLIENT: Lazy<reqwest::Client> = Lazy::new(|| {
    let config = AppConfig::new();
    create_client(&config)
});
```
🛠️ Refactor suggestion

**Avoid re-initializing configuration within static contexts.** Creating a new `AppConfig` inside the static `CLIENT` may lead to inconsistencies, especially if `AppConfig::new()` reads from environment variables or configuration files that might change between different parts of the application. Consider passing a reference to a globally initialized `AppConfig` when creating the client. For example, if `AppConfig` is initialized in `main`, you can structure your code to pass it accordingly.

Suggestion: modify your `main.rs` to initialize `AppConfig` and pass it to where `CLIENT` is needed. Alternatively, avoid having `CLIENT` as a static variable and create the client instance where needed, passing the appropriate configuration:

```rust
// Remove the static CLIENT
// pub static CLIENT: Lazy<reqwest::Client> = Lazy::new(|| {
//     let config = AppConfig::new();
//     create_client(&config)
// });
```

Then, ensure that the client is created with the correct configuration and passed to functions or modules that require it. This approach ensures consistency and allows for better testing and flexibility.
Force-pushed from `c0e7aa6` to `38808de`, then from `38808de` to `18fd094`.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (10)
**src/proxy/client.rs (1)**

Lines 17-29: **Consider making timeout values configurable.** The function contains several hardcoded timeout values:

- `pool_idle_timeout`: 30s
- `http2_keep_alive_timeout`: 30s
- `timeout`: 30s
- `connect_timeout`: 10s

Consider moving these to the `AppConfig` for better flexibility and configuration management.

```diff
 // In src/config.rs
 pub struct AppConfig {
     pub max_connections: usize,
     pub tcp_keepalive_interval: u64,
     pub tcp_nodelay: bool,
+    pub pool_idle_timeout_secs: u64,
+    pub http2_keep_alive_timeout_secs: u64,
+    pub request_timeout_secs: u64,
+    pub connect_timeout_secs: u64,
 }
```

**Cargo.toml (1)**
Lines 24-29: **Consider the tradeoffs of aggressive optimizations.** While these release profile settings will maximize runtime performance, they come with important tradeoffs:

- `codegen-units = 1` will significantly increase compile times
- `panic = "abort"` removes panic unwinding, which could affect error recovery
- `strip = true` removes debug symbols, making production issues harder to diagnose

Consider whether these tradeoffs align with your operational requirements.

**src/config.rs (3)**
Lines 9-13: **Consider adding documentation for the performance parameters.** The new configuration fields would benefit from documentation explaining their purpose and impact, for example: `worker_threads` (how it affects request handling), `max_connections` (impact on system resources), `tcp_keepalive_interval` (network connection management), `tcp_nodelay` (TCP packet behavior, i.e., Nagle's algorithm), and `buffer_size` (memory usage implications).

```diff
 pub struct AppConfig {
     pub port: u16,
     pub host: String,
+    /// Number of worker threads for handling requests.
+    /// Defaults to CPU cores * 2 for systems with ≤4 cores,
+    /// or CPU cores + 4 for systems with >4 cores.
     pub worker_threads: usize,
+    /// Maximum number of concurrent connections.
+    /// Default: 10,000
     pub max_connections: usize,
+    /// TCP keepalive interval in seconds.
+    /// Default: 30 seconds
     pub tcp_keepalive_interval: u64,
+    /// Enable/disable TCP_NODELAY (Nagle's algorithm).
+    /// Default: true
     pub tcp_nodelay: bool,
+    /// Size of the internal buffer for request/response handling.
+    /// Default: 8KB
     pub buffer_size: usize,
 }
```
Lines 42-45: **Review the default `max_connections` value.** The default of 10,000 connections might be too high for some systems. Consider:

- System resource limitations (file descriptors, memory)
- OS-specific limits
- Available system resources

Consider implementing a dynamic calculation based on system resources or providing a more conservative default. Also, add validation to ensure the value doesn't exceed system limits.
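One way to make the default safer is to clamp the configured value against the process's file-descriptor budget. The sketch below uses a hypothetical `effective_max_connections` helper; in practice the fd limit would come from `getrlimit(RLIMIT_NOFILE)` (e.g., via the `rlimit` crate), but here it is passed in as an assumed parameter, and the 256-descriptor reserve is an illustrative assumption:

```rust
/// Hypothetical sketch: clamp a requested connection limit against the
/// process's file-descriptor limit, reserving some descriptors for
/// files, logging, and upstream sockets.
fn effective_max_connections(requested: usize, fd_limit: usize) -> usize {
    let reserved = 256; // assumption: fds kept free for non-connection use
    let budget = fd_limit.saturating_sub(reserved);
    requested.min(budget).max(1)
}

fn main() {
    // With a common default `ulimit -n` of 1024, a 10,000 default is
    // far above what the process can actually open:
    println!("{}", effective_max_connections(10_000, 1024));   // prints 768
    println!("{}", effective_max_connections(10_000, 65_536)); // prints 10000
}
```

Clamping at startup (with a warning log when the clamp fires) turns a latent "too many open files" runtime failure into a visible configuration message.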
Lines 38-57: **Improve error handling for environment variables.** The current implementation silently falls back to defaults when parsing fails. Consider logging parsing errors to help with debugging configuration issues.

```diff
 worker_threads: env::var("WORKER_THREADS")
     .ok()
-    .and_then(|v| v.parse().ok())
+    .and_then(|v| match v.parse() {
+        Ok(val) => Some(val),
+        Err(e) => {
+            debug!("Failed to parse WORKER_THREADS: {}", e);
+            None
+        }
+    })
     .unwrap_or(default_workers),
```

**.github/workflows/docker-build.yml (3)**
Lines 3-14: **Consider optimizing workflow triggers.** The current configuration triggers builds for all branches. Consider limiting this to specific branches (e.g., main, develop, feature/*) to avoid unnecessary builds and save on GitHub Actions minutes.

```diff
 on:
   push:
     branches:
-      - '**'  # Match all branches
+      - 'main'
+      - 'develop'
+      - 'feature/**'
```
Line 96: **Add newline at end of file.** Add a newline character at the end of the file to comply with POSIX conventions.

```diff
     platforms: linux/amd64,linux/arm64
+
```
🧰 Tools: 🪛 yamllint. [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
Lines 1-96: **Consider implementing workflow reusability.** The workflow is well-structured but could benefit from being split into reusable workflows, especially the Docker build and push logic, which is duplicated between GHCR and Docker Hub. Consider:

- Creating a reusable workflow for the Docker build and push logic
- Parameterizing the registry configuration
- Calling the reusable workflow with different parameters for each registry
🧰 Tools:

- 🪛 actionlint. 28-28: shellcheck reported issue in this script: SC2086: Double quote to prevent globbing and word splitting (shellcheck)
- 🪛 yamllint. [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
**src/main.rs (1)**

Lines 35-38: **Consider making thread settings configurable.** While the worker threads are configurable, the thread stack size is hardcoded to 2 MB. Consider:

- Making the stack size configurable via `AppConfig`
- Adding validation for the `worker_threads` value
- Adding debug logs for the stack-size setting

```diff
-    std::env::set_var("TOKIO_THREAD_STACK_SIZE", (2 * 1024 * 1024).to_string());
+    debug!("Setting thread stack size to {}MB", config.thread_stack_size);
+    std::env::set_var("TOKIO_THREAD_STACK_SIZE",
+        (config.thread_stack_size * 1024 * 1024).to_string());
```

**src/proxy/mod.rs (1)**
Lines 144-147: **Consider reusing the `BytesMut` buffer.** Creating a new buffer for each chunk in the stream could be inefficient. Consider reusing the buffer by moving its creation outside the `map` closure. Apply this diff:

```diff
+let mut buffer = BytesMut::with_capacity(config.buffer_size);
 let stream = response.bytes_stream().map(move |result| {
     match result {
         Ok(bytes) => {
-            let mut buffer = BytesMut::with_capacity(config.buffer_size);
+            buffer.clear();
             buffer.extend_from_slice(&bytes);
             Ok(buffer.freeze())
         }
```
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL

📒 Files selected for processing (7):

- `.github/workflows/docker-build.yml` (1 hunks)
- `Cargo.toml` (3 hunks)
- `src/config.rs` (1 hunks)
- `src/handlers.rs` (1 hunks)
- `src/main.rs` (2 hunks)
- `src/proxy/client.rs` (1 hunks)
- `src/proxy/mod.rs` (4 hunks)

🚧 Files skipped from review as they are similar to previous changes (1):

- `src/handlers.rs`

🧰 Additional context used:

- 🪛 actionlint (`.github/workflows/docker-build.yml`). 28-28: shellcheck reported issue in this script: SC2086: Double quote to prevent globbing and word splitting (shellcheck)
- 🪛 yamllint (`.github/workflows/docker-build.yml`). [error] 96-96: no new line character at the end of file (new-line-at-end-of-file)
🔇 Additional comments (18)

**src/proxy/client.rs (4)**

Lines 3-5: **LGTM: Clean imports with clear purpose.** The new imports are well-organized and necessary for the implemented functionality.

Lines 7-14: **Well-structured logging implementation.** Good use of both info and debug levels for logging client configuration details.

Lines 34-36: **Previous concern about static `AppConfig` initialization remains valid.** The current implementation still creates a new `AppConfig` inside the static `CLIENT` initialization, which could lead to inconsistencies.

Line 19: **Verify HTTP/2 prior knowledge setting.** The `http2_prior_knowledge()` setting assumes all target endpoints support HTTP/2. This could cause issues if any endpoints only support HTTP/1.1.

✅ Verification successful: **HTTP/2 prior knowledge is appropriate for the API endpoints used.** The codebase exclusively interacts with modern AI service provider APIs (Anthropic, OpenAI, Groq, Fireworks, Together) that are known to support HTTP/2. The client is configured with other HTTP/2 optimizations like adaptive window and keep-alive settings, indicating a deliberate choice for HTTP/2 performance. Using `http2_prior_knowledge()` is safe and beneficial in this context.

🏁 Scripts executed for the analysis:

```bash
#!/bin/bash
# Search for endpoint configurations or URL patterns in the codebase
rg -g '!target/' -g '!*.lock' 'https?://' --type rust
```

Length of output: 506

```bash
#!/bin/bash
# Check if there are any HTTP/2 specific configurations or error handling
rg -g '!target/' -g '!*.lock' 'http2|HTTP/2|h2' --type rust -A 3
# Check proxy client configuration and usage
rg -g '!target/' -g '!*.lock' 'Client::builder|reqwest::Client' --type rust -A 3
```

Length of output: 1394
Cargo.toml (4)
15-15
: LGTM! Good security practice
Adding .env
to the exclude list is a good security practice to prevent accidental publication of environment variables.
37-37
: Verify rustls-tls compatibility with your deployment environments
Switching to rustls-tls
with default-features = false
removes native-tls support. While rustls is more memory-safe and often faster, ensure it's compatible with your deployment environments and proxy configurations.
32-47
: Performance-focused dependency configuration looks good
The dependency updates align well with the PR's performance objectives:
- `parking_lot` for faster synchronization primitives
- `compression-full` for better response compression
- `http2` support for improved connection handling
- Addition of `num_cpus` for dynamic thread pool sizing
3-3
: Verify version number sequence
The version number jumps from 0.1.6 to 0.1.8, skipping 0.1.7. Please confirm if this is intentional.
✅ Verification successful
Version bump from 0.1.6 to 0.1.8 appears to be a direct update
Based on the git history, this appears to be a direct version bump from 0.1.6 to 0.1.8 along with significant dependency updates and feature enhancements. The version increment is justified by the scope of changes:
- Upgraded reqwest from 0.11 to 0.12.9 with additional features
- Enhanced multiple dependencies with new features (tokio, tower-http, tracing, etc.)
- Switched to rustls-tls and added compression features
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check git tags for version history
git tag | grep -E "v?0\.1\.[0-9]+" | sort -V
Length of output: 46
Script:
#!/bin/bash
# Let's check git commit history for version changes in Cargo.toml
git log -p -- Cargo.toml | grep -B 1 "version = "
Length of output: 1759
src/config.rs (2)
2-4
: LGTM: Appropriate imports for the new functionality
The new imports support CPU core detection and enhanced logging capabilities.
60-62
: LGTM: Comprehensive logging implementation
The logging implementation provides good visibility into the configuration loading process, including both high-level info and detailed debug information.
.github/workflows/docker-build.yml (3)
18-21
: LGTM! Well-scoped permissions
The permissions are correctly minimized to only what's needed: read for contents and write for packages.
36-46
: LGTM! Efficient build configuration
The configuration is well-structured with:
- Proper cache usage for faster builds
- Multi-platform support (amd64, arm64)
- Build-only verification for non-main branches
47-77
: LGTM! Secure GHCR configuration
The GHCR publishing setup follows security best practices:
- Uses GITHUB_TOKEN for authentication
- Implements proper tagging strategy
- Maintains build cache
src/main.rs (4)
7-7
: LGTM: Improved logging setup
Good improvements to the logging configuration:
- Using structured logging with compact format
- Proper fallback for RUST_LOG environment variable
- Clear initialization message
Also applies to: 21-27
48-54
: LGTM: Enhanced connection handling
Good improvement using `into_make_service_with_connect_info`, which enables access to client connection information. This will be useful for:
- Request logging with client IP
- Rate limiting per client
- Connection tracking
80-99
: LGTM: Well-implemented signal handling
Excellent implementation of the shutdown signal handler:
- Handles both CTRL+C and SIGTERM signals
- Uses tokio::select! for efficient concurrent signal handling
- Clear logging of shutdown stages
31-34
: Review logging of configuration values
While debug logging configuration is helpful, ensure no sensitive values (if any exist in AppConfig) are being logged. Consider masking sensitive values or moving this to trace level logging.
✅ Verification successful
Debug logging of configuration is acceptable
After reviewing the AppConfig struct definition, I can confirm that it only contains non-sensitive configuration values like port, host, worker threads, and network-related settings. The current debug logging of port and host is appropriate and doesn't pose any security concerns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if AppConfig contains any sensitive fields
ast-grep --pattern 'struct AppConfig {
$$$
password: $_,
$$$
}' src/config.rs
ast-grep --pattern 'struct AppConfig {
$$$
key: $_,
$$$
}' src/config.rs
ast-grep --pattern 'struct AppConfig {
$$$
secret: $_,
$$$
}' src/config.rs
Length of output: 250
Script:
#!/bin/bash
# Let's first check if the config.rs file exists and see its content
fd "config.rs" --type f --exec cat {}
# As a backup, let's also check for AppConfig struct definition in any Rust file
rg "struct AppConfig" -A 10
Length of output: 2619
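If a sensitive field were ever added to `AppConfig`, one way to keep it out of debug logs is a manual `Debug` impl that redacts it before formatting. A minimal sketch, assuming a hypothetical `api_key` field (not present in the actual struct):

```rust
use std::fmt;

// Illustrative config struct; field names are assumptions, not from the PR.
struct AppConfig {
    host: String,
    port: u16,
    api_key: Option<String>, // a sensitive field, if one were ever added
}

// Manual Debug impl that masks the sensitive value before it reaches logs.
impl fmt::Debug for AppConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("AppConfig")
            .field("host", &self.host)
            .field("port", &self.port)
            .field("api_key", &self.api_key.as_deref().map(|_| "***redacted***"))
            .finish()
    }
}

fn main() {
    let config = AppConfig {
        host: "127.0.0.1".into(),
        port: 3000,
        api_key: Some("super-secret".into()),
    };
    let rendered = format!("{:?}", config);
    // The secret never appears in the formatted output.
    assert!(!rendered.contains("super-secret"));
    assert!(rendered.contains("redacted"));
    println!("{rendered}");
}
```

With this in place, `debug!("{:?}", config)` stays safe even if the struct later grows sensitive fields.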
src/proxy/mod.rs (1)
Line range hint 17-187
: Overall improvements look good!
The changes show good attention to performance optimization through:
- Pre-allocation of header maps
- Efficient buffer handling for streaming responses
- Improved configuration management
- Better logging for observability
The previous header handling issues have been properly addressed. Once the identified issues are fixed, this will be a solid improvement to the codebase.
.build()
    .expect("Failed to create HTTP client")
}
🛠️ Refactor suggestion
Improve error handling for client creation
Using `expect()` could crash the application if client creation fails. Consider propagating the error to the caller for better error handling.
-pub fn create_client(config: &AppConfig) -> reqwest::Client {
+pub fn create_client(config: &AppConfig) -> Result<reqwest::Client, reqwest::Error> {
info!("Creating HTTP client with optimized settings");
// ... rest of the configuration ...
- .build()
- .expect("Failed to create HTTP client")
+ .build()
}
Committable suggestion skipped: line range outside the PR's diff.
// Optimize thread count based on CPU cores
let cpu_count = num_cpus::get();
debug!("Detected {} CPU cores", cpu_count);

let default_workers = if cpu_count <= 4 {
    cpu_count * 2
} else {
    cpu_count + 4
};
🛠️ Refactor suggestion
Review worker thread calculation strategy
The current worker thread calculation (`cpu_count * 2` for ≤4 cores, `cpu_count + 4` for >4 cores) might need adjustment:
- For single-core systems, having 2 worker threads might be excessive
- For many-core systems (32+), the current formula might create too many threads
Consider implementing a more bounded approach:
- let default_workers = if cpu_count <= 4 {
- cpu_count * 2
- } else {
- cpu_count + 4
- };
+ let default_workers = {
+ let calculated = if cpu_count <= 4 {
+ cpu_count * 2
+ } else {
+ cpu_count + 4
+ };
+ // Ensure reasonable bounds
+ calculated.clamp(1, 32)
+ };
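The bounded heuristic above can be sketched as a standalone function. The 1–32 bounds are the suggestion's illustrative defaults, not values taken from the PR:

```rust
// Sketch of the bounded worker-thread heuristic suggested in the review.
// The clamp bounds (1, 32) are illustrative assumptions.
fn default_workers(cpu_count: usize) -> usize {
    let calculated = if cpu_count <= 4 {
        cpu_count * 2
    } else {
        cpu_count + 4
    };
    // Ensure reasonable bounds
    calculated.clamp(1, 32)
}

fn main() {
    assert_eq!(default_workers(1), 2);   // small machines still get some parallelism
    assert_eq!(default_workers(4), 8);
    assert_eq!(default_workers(16), 20);
    assert_eq!(default_workers(64), 32); // many-core systems are capped
    println!("ok");
}
```

Factoring the heuristic out also makes it trivial to unit test, which the inline expression does not allow.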
buffer_size: env::var("BUFFER_SIZE")
    .ok()
    .and_then(|v| v.parse().ok())
    .unwrap_or(8 * 1024), // 8KB default
Validate buffer_size constraints
The buffer size should be validated to ensure it's within reasonable bounds:
- Minimum size to handle typical requests
- Maximum size to prevent memory issues
buffer_size: env::var("BUFFER_SIZE")
.ok()
- .and_then(|v| v.parse().ok())
- .unwrap_or(8 * 1024), // 8KB default
+ .and_then(|v| v.parse().ok())
+ .map(|size: usize| size.clamp(4096, 32 * 1024)) // Clamp between 4KB and 32KB
+ .unwrap_or(8 * 1024), // 8KB default
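The suggested clamping can be sketched as a small pure function. The 4 KB–32 KB bounds mirror the suggestion above, and the `buffer_size_from` helper name is an assumption for illustration:

```rust
// Sketch of parsing BUFFER_SIZE with bounds, per the review suggestion.
// Bounds (4 KB floor, 32 KB ceiling) and the helper name are assumptions.
fn buffer_size_from(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.parse::<usize>().ok())
        .map(|size| size.clamp(4096, 32 * 1024))
        .unwrap_or(8 * 1024) // 8KB default
}

fn main() {
    assert_eq!(buffer_size_from(None), 8192);           // unset -> default
    assert_eq!(buffer_size_from(Some("oops")), 8192);   // unparsable -> default
    assert_eq!(buffer_size_from(Some("1024")), 4096);   // too small -> floor
    assert_eq!(buffer_size_from(Some("65536")), 32768); // too large -> ceiling
    assert_eq!(buffer_size_from(Some("16384")), 16384); // in range -> kept
    println!("ok");
}
```

In the actual config loader this would be called with `env::var("BUFFER_SIZE").ok().as_deref()`.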
tags: |
  magicapi1/magicapi-ai-gateway:latest
  magicapi1/magicapi-ai-gateway:${{ env.CARGO_VERSION }}
cache-from: type=gha
🛠️ Refactor suggestion
Avoid hardcoding Docker Hub image name
The Docker Hub image name is hardcoded, which could cause issues if the repository name changes. Consider using environment variables or GitHub context.
tags: |
- magicapi1/magicapi-ai-gateway:latest
- magicapi1/magicapi-ai-gateway:${{ env.CARGO_VERSION }}
+ ${{ secrets.DOCKERHUB_USERNAME }}/${{ github.event.repository.name }}:latest
+ ${{ secrets.DOCKERHUB_USERNAME }}/${{ github.event.repository.name }}:${{ env.CARGO_VERSION }}
- name: Extract version from Cargo.toml
  id: version
  run: |
    VERSION=$(awk -F '"' '/^version = / {print $2}' Cargo.toml)
    echo "CARGO_VERSION=${VERSION}" >> $GITHUB_ENV
    echo "Version found: ${VERSION}"
Fix shell script quoting
The version extraction script needs proper quoting to handle potential spaces in paths and prevent word splitting.
run: |
- VERSION=$(awk -F '"' '/^version = / {print $2}' Cargo.toml)
- echo "CARGO_VERSION=${VERSION}" >> $GITHUB_ENV
+ VERSION=$(awk -F '"' '/^version = / {print $2}' "Cargo.toml")
+ echo "CARGO_VERSION=${VERSION}" >> "$GITHUB_ENV"
echo "Version found: ${VERSION}"
🧰 Tools
🪛 actionlint
28-28: shellcheck reported issue in this script: SC2086:info:2:36: Double quote to prevent globbing and word splitting
(shellcheck)
info!("Setting up TCP listener with non-blocking mode");
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");

debug!("Converting to tokio TCP listener");
let listener = tokio::net::TcpListener::from_std(tcp_listener)
    .expect("Failed to create Tokio TCP listener");

info!(
    "AI Gateway listening on {}:{} with {} worker threads",
    config.host, config.port, config.worker_threads
);

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await
    .unwrap_or_else(|e| {
        error!("Server error: {}", e);
        std::process::exit(1);
    });
🛠️ Refactor suggestion
Consider enhancing error handling
While the error handling has improved with `unwrap_or_else`, consider implementing a more graceful shutdown:
- Log all error details including context
- Attempt to clean up resources
- Use a proper error code based on the failure type
- .unwrap_or_else(|e| {
- error!("Server error: {}", e);
- std::process::exit(1);
- });
+ .unwrap_or_else(|e| {
+ error!("Server error: {:#}", e);
+ error!("Shutting down due to critical error");
+ // Attempt cleanup
+ std::process::exit(match e.kind() {
+ std::io::ErrorKind::AddrInUse => 2,
+ std::io::ErrorKind::PermissionDenied => 3,
+ _ => 1,
+ });
+ });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
info!("Setting up TCP listener with non-blocking mode");
let tcp_listener = std::net::TcpListener::bind(addr).expect("Failed to bind address");
tcp_listener.set_nonblocking(true).expect("Failed to set non-blocking");

debug!("Converting to tokio TCP listener");
let listener = tokio::net::TcpListener::from_std(tcp_listener)
    .expect("Failed to create Tokio TCP listener");

info!(
    "AI Gateway listening on {}:{} with {} worker threads",
    config.host, config.port, config.worker_threads
);

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await
    .unwrap_or_else(|e| {
        error!("Server error: {:#}", e);
        error!("Shutting down due to critical error");
        // Attempt cleanup
        std::process::exit(match e.kind() {
            std::io::ErrorKind::AddrInUse => 2,
            std::io::ErrorKind::PermissionDenied => 3,
            _ => 1,
        });
    });
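The suggested exit-code mapping can also be factored into a small helper, which keeps the serve call readable and makes the mapping testable. The helper name `exit_code_for` and the specific codes (2, 3, 1) are illustrative, matching the suggestion above:

```rust
use std::io;

// Sketch of mapping server failures to distinct exit codes, per the
// review suggestion. The helper name and code values are assumptions.
fn exit_code_for(err: &io::Error) -> i32 {
    match err.kind() {
        io::ErrorKind::AddrInUse => 2,
        io::ErrorKind::PermissionDenied => 3,
        _ => 1,
    }
}

fn main() {
    let in_use = io::Error::new(io::ErrorKind::AddrInUse, "address in use");
    let denied = io::Error::new(io::ErrorKind::PermissionDenied, "permission denied");
    let other = io::Error::new(io::ErrorKind::Other, "something else");
    assert_eq!(exit_code_for(&in_use), 2);
    assert_eq!(exit_code_for(&denied), 3);
    assert_eq!(exit_code_for(&other), 1);
    println!("ok");
}
```

Distinct exit codes let process supervisors (systemd, Kubernetes) react differently to a port conflict versus a permissions problem.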
Ok(response_builder
    .body(Body::from_stream(stream))
    .unwrap())
Replace unwrap() with proper error handling
Using `unwrap()` could cause panics if the response builder fails. Consider proper error handling using `map_err()`.
Apply this diff for streaming response:
- Ok(response_builder
- .body(Body::from_stream(stream))
- .unwrap())
+ response_builder
+ .body(Body::from_stream(stream))
+ .map_err(|e| AppError::ResponseBuildError(e.to_string()))
And for regular response:
- Ok(response_builder
- .body(Body::from(body))
- .unwrap())
+ response_builder
+ .body(Body::from(body))
+ .map_err(|e| AppError::ResponseBuildError(e.to_string()))
Also applies to: 185-187
) -> Result<Response<Body>, AppError> {
    debug!("Preparing to send request: {} {}", method, url);

    let body_bytes = body::to_bytes(body, usize::MAX).await?;
Consider adding a configurable body size limit
Using `usize::MAX` as the body size limit could lead to memory exhaustion with large payloads. Consider using a configurable limit from the `AppConfig`.
Apply this diff:
- let body_bytes = body::to_bytes(body, usize::MAX).await?;
+ let body_bytes = body::to_bytes(body, config.buffer_size).await?;
Committable suggestion skipped: line range outside the PR's diff.
Use Thread and various performance improvements
Summary by CodeRabbit
Release Notes
New Features
Improvements
Bug Fixes
Chores
Added a `.dockerignore` file to streamline the Docker build process.