Visualizing 2023 Web Traffic Trends Through ATI's Lens
In a constantly shifting digital landscape, accurately tracking and interpreting web traffic trends is crucial for network operators and equipment vendors. The ATI group has built a new automated web crawler that collected traffic from the most popular websites on a monthly basis throughout 2023. We concentrated on the monthly traffic patterns of the top 50 websites, aiming to untangle the interplay of network and application protocol trends. The result is an overall view of 2023's network traffic trends and technology stacks. Let's walk through the data and turn it into actionable technical insights.
Data Acquisition Methods
In our analysis, we focused on the web traffic of the top 50 websites listed by SimilarWeb as of January 2023. SimilarWeb has filled the void left when Alexa shut down in 2022, though it competes with Cloudflare's Radar project. The methodology involved systematic monthly crawls of these websites, executed from a server located in the US-East region.
The crawling process utilized in-house techniques, augmented by the state-of-the-art Keysight UI/UX automation software, Eggplant. This combination ensured a thorough and efficient data collection process.
We crawled the full list of top websites every month, recording the network traffic generated in each session. Each website was crawled to a fixed depth in depth-first search (DFS) order to broaden coverage. This consistent approach yielded a rich dataset for studying web traffic trends and patterns over the year.
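To make the crawl strategy concrete, here is a minimal Python sketch of a depth-limited DFS. It is an illustration only, not the ATI crawler: the function name crawl_dfs, the get_links callback, and the in-memory SITE map are all hypothetical stand-ins for real page fetching and link extraction.

```python
from typing import Callable, Iterable


def crawl_dfs(start: str,
              get_links: Callable[[str], Iterable[str]],
              max_depth: int) -> list[str]:
    """Visit pages depth-first from `start`, following links up to `max_depth` hops."""
    visited: set[str] = set()
    order: list[str] = []

    def visit(url: str, depth: int) -> None:
        # Skip pages already seen and pages beyond the depth limit.
        if url in visited or depth > max_depth:
            return
        visited.add(url)
        order.append(url)
        # Go as deep as allowed along each link before trying the next one.
        for link in get_links(url):
            visit(link, depth + 1)

    visit(start, 0)
    return order


# Hypothetical in-memory site map standing in for real page fetches.
SITE = {
    "/": ["/news", "/about"],
    "/news": ["/news/1", "/"],
    "/news/1": ["/news/1/comments"],
    "/about": [],
}

pages = crawl_dfs("/", lambda url: SITE.get(url, []), max_depth=2)
print(pages)  # ['/', '/news', '/news/1', '/about']
```

With max_depth=2 the comments page, three hops from the start, is never visited, and the link back to "/" is ignored because that page has already been seen.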
Key Findings
The study examined 91,078 TLS handshakes and a total of 1,018,932 HTTP request-response packets. This report will primarily concentrate on HTTP/1, HTTP/2, and HTTP/3 traffic, as they represent a significant portion of internet traffic.
Traffic Insights
A significant portion of the traffic, 73.2%, was HTTP/2, followed by the emerging HTTP/3 at 18.46%, with the remainder being HTTP/1. Notably, all the observed traffic was HTTPS, indicating a marked decline in the use of plaintext HTTP.
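The HTTP/1 remainder follows directly from the two reported figures. The short Python sketch below works it out and also shows one way such shares could be derived from per-request protocol labels. The version_shares helper and the sample labels are hypothetical and are not the study's data.

```python
from collections import Counter

# Shares reported in the text (percent of observed HTTP traffic).
HTTP2_SHARE = 73.2
HTTP3_SHARE = 18.46

# Whatever is neither HTTP/2 nor HTTP/3 is attributed to HTTP/1.
http1_share = round(100.0 - HTTP2_SHARE - HTTP3_SHARE, 2)
print(f"HTTP/1 share: {http1_share}%")  # HTTP/1 share: 8.34%


def version_shares(labels):
    """Return each protocol label's percentage of the total, rounded to 2 places."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: round(100.0 * n / total, 2) for version, n in counts.items()}


# Hypothetical sample with one label per observed request.
sample = ["h2"] * 7 + ["h3"] * 2 + ["http/1.1"]
print(version_shares(sample))
```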
Fig 1: Distribution of HTTP Versions
Images were the most frequently observed content type by file count, consistent with the image-heavy layouts of modern websites. HTML files ranked second, followed by JavaScript files. JSON files, which are typically used for data exchange with backend services, were also present in considerable numbers.
Fig 2: Distribution of Content Types by File Count
Server Analysis
Nginx was the predominant server, appearing significantly more often than any other server type. Amazon S3 and Apache followed in prevalence.
Fig 3: Distribution of Server Stack Technologies
Additionally, in some instances only a generic server keyword was detected, indicating that specific server information was either not provided or intentionally masked for security or privacy reasons. This masking can be done by a proxy or simply through a configuration setting.
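As a rough illustration of how raw Server response headers might be grouped into families for this kind of tally, consider the Python sketch below. It is not the method used in the study. The server_family helper is hypothetical, and it assumes that the generic keyword appears as the literal header value "Server".

```python
def server_family(server_header):
    """Map a raw Server header value to a coarse, lower-case family name."""
    if not server_header or not server_header.strip():
        return "unspecified"  # header absent or empty
    # Keep only the product token: "Apache/2.4.41 (Ubuntu)" -> "apache".
    product = server_header.strip().split()[0].split("/")[0].lower()
    if product == "server":
        # Assumption: a bare "Server" value is the masked, generic case.
        return "generic"
    return product


examples = ["nginx/1.18.0", "Apache/2.4.41 (Ubuntu)", "AmazonS3", "Server", None]
print([server_family(h) for h in examples])
# ['nginx', 'apache', 'amazons3', 'generic', 'unspecified']
```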
The distribution of traffic by server location shows that the majority of the traffic originated from servers located in the United States (US). A significant portion came from servers in Canada (CA), while servers in Japan (JP) and Russia (RU) also played notable roles in serving the observed web traffic.
Fig 4: Distribution of Traffic by Server Location
It's essential to note that these statistics are based on traffic observed from a client situated in the US East 2 region and cover only the top 50 websites. Server locations were determined from the observed IP addresses using a third-party service, Geoapify.
Encryption Analysis
TLS 1.3 accounted for 65.84% of the traffic, including both HTTPS and QUIC, reflecting broad adoption of the current version of the protocol. TLS 1.2 made up the remaining 34.16%, so both versions remain in active use.
Fig 5: Distribution of TLS version
The most prevalent cipher suite was "TLS_AES_128_GCM_SHA256", at 40% of usage. "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256" follows with 22%, indicating its significant adoption for secure communications.
Fig 6: Distribution of Cipher-Suites
In contrast, "TLS_AES_256_GCM_SHA384" at 18% and "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384" at 7% reflect a preference for stronger 256-bit encryption. An "OTHER" category covers a diverse array of suites, collectively representing 14% of usage.
The distribution reflects a balance between performance, security, and the need for stronger encryption across different applications.
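The reported cipher-suite shares can be regrouped by key size and by protocol generation, as in the Python sketch below. The percentages are the ones quoted in this section; they sum to 101, presumably because of rounding. The helper names are hypothetical. The protocol split relies on the standard naming convention in which TLS 1.3 suite names omit the key exchange and therefore contain no "_WITH_".

```python
from collections import defaultdict

# Cipher-suite shares as reported in the text (percent of usage).
REPORTED = {
    "TLS_AES_128_GCM_SHA256": 40,
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256": 22,
    "TLS_AES_256_GCM_SHA384": 18,
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384": 7,
    "OTHER": 14,
}


def key_size(suite):
    """Classify a suite by its AES key length."""
    if "AES_128" in suite:
        return "AES-128"
    if "AES_256" in suite:
        return "AES-256"
    return "other"


def protocol(suite):
    """Classify a suite by protocol generation, based on its name."""
    if suite == "OTHER":
        return "unknown"
    return "TLS 1.2 or earlier" if "_WITH_" in suite else "TLS 1.3"


def group_shares(shares, key):
    """Sum the percentage shares of all suites that map to the same group."""
    totals = defaultdict(int)
    for suite, pct in shares.items():
        totals[key(suite)] += pct
    return dict(totals)


print(group_shares(REPORTED, key_size))
# {'AES-128': 62, 'AES-256': 25, 'other': 14}
print(group_shares(REPORTED, protocol))
# {'TLS 1.3': 58, 'TLS 1.2 or earlier': 29, 'unknown': 14}
```

Among the named suites, the 128-bit variants account for 62% of usage against 25% for the 256-bit variants.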
Conclusion
The year-long analysis of top websites in 2023 reveals several significant trends: HTTP/2 carries most of the traffic (73.2%) while HTTP/3 is gaining ground (18.46%), all observed traffic is encrypted, and TLS 1.3 leads (65.84%) while TLS 1.2 remains widely deployed (34.16%). The web is steadily evolving to balance innovation with reliability, which is a critical insight for network infrastructure stakeholders. Network equipment therefore needs to support both new and established protocols to provide robust, secure, and efficient connectivity. These trends can guide the development and validation of networking technologies, helping to shape a future-ready internet infrastructure.
Testing Network Resilience with BreakingPoint
In today's rapidly evolving digital landscape, testing your network equipment against the latest digital trends is crucial to ensuring optimal performance and security. The continuous advancement of technology introduces new challenges and opportunities, making it essential to evaluate how your network equipment handles real-world scenarios.
Fig: Keysight BreakingPoint
BreakingPoint offers unique capabilities, such as blending a wide range of application traffic, to craft a true-to-life network traffic simulation that flows through your network equipment. For more details about Keysight BreakingPoint, and to test your network equipment against the most up-to-date network traffic on the internet, visit BreakingPoint.