Revision as of 13:39, 11 January 2023

What Problems Does SafeSquid’s Performance Plot Solve

SafeSquid’s Performance Plot is a convenient way to visualize the data collected in SafeSquid’s performance log. It presents quantitative data for a specific subject over a specific time interval.

SafeSquid’s performance plots are used to:

Extract past system utilization data that cannot be retrieved from SafeSquid’s CLI without installing a third-party application or creating a system monitoring script.

Native applications such as top, free, vmstat, sar, etc. show real-time system resource usage data.

Example: Your monitoring application has sent an alert about increased memory consumption over the past couple of hours. Using SafeSquid’s performance plot, you can generate a plot for those hours, analyze the behavior of your proxy server, and identify the memory utilization.

Understand the pattern to better analyze the problem.

Suppose your proxy server experiences intense system load at random intervals, but you cannot spot a pattern because the spikes occur at random and the server works normally on most days.

These random load spikes have sometimes caused issues during critical working hours.

Issues with intermittent system load can easily go unnoticed for months.

One of the most noticeable symptoms is that the system becomes slow and unresponsive.

This can occur if the system is unable to keep up with the demands placed on it, and it can cause delays or timeouts when completing requests. If intense system usage goes unnoticed for long enough, the system may crash or become unstable.

This can happen if the load causes the system to run out of memory or other resources, or causes it to overheat. Typically, higher resource utilization is observed when SafeSquid has to handle a large number of concurrent connections and system resources are insufficient.

As SafeSquid is a multi-threaded application, it opens a new thread for every active connection. Limited resources combined with a high volume of concurrent connections lead to higher CPU utilization. Another reason for higher CPU usage could be running multiple programs simultaneously: the CPU works hard to switch between programs, which can cause intense CPU utilization.

It is important to ensure that you have proper monitoring and logging in place to detect and diagnose issues with the system, so that you can take corrective action as soon as possible.

Reduce response time when requesting support.

If you are creating a support ticket for problems such as increased system utilization, slow response time, too many open connections, etc., you can generate the performance plot and attach it to the support ticket. This reduces response time and helps the support technician solve your problem.

The performance plot helps the support staff understand the root cause of the problem.

Default Process of Creating Performance Plot

SafeSquid’s web interface, http://safesquid.cfg, has an option to generate a performance plot for a selected time frequency.

Generating a performance plot from SafeSquid’s web interface is slow, and if the selected time frequency contains a large amount of user log data, generation can take a very long time.

This can delay critical troubleshooting.

Automating plot generation at set time intervals is not possible from the web interface. For example, plots cannot be generated every hour or every day.

Generating performance plots from SafeSquid’s CLI is a better option, as the generation time is lower than for plots generated via SafeSquid’s web interface.

However, to use the CLI script you need to provide the start time and end time as arguments using environment variables.

Getting the required start time and end time can be challenging, as the timestamp used in the performance log is not easy to understand.
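
As a rough illustration of working out a start and end time for the last hour, the sketch below uses GNU date. It assumes the timestamps are Unix epoch seconds, which this page does not confirm, so check the format in your performance log first. START_TIME and END_TIME are hypothetical variable names, not the names the SafeSquid CLI script expects.

```shell
# Assumption: timestamps are Unix epoch seconds; verify against your performance log.
# START_TIME and END_TIME are hypothetical names used only for this sketch.
END_TIME="$(date +%s)"                 # now, in epoch seconds
START_TIME="$((END_TIME - 3600))"      # one hour earlier

# Print both values in a human-readable form to confirm the window (GNU date).
echo "start: $START_TIME ($(date -d "@$START_TIME" '+%Y-%m-%d %H:%M:%S'))"
echo "end:   $END_TIME ($(date -d "@$END_TIME" '+%Y-%m-%d %H:%M:%S'))"
```

Changing 3600 to 86400 gives a one-day window in the same way.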

Creating Performance Plot using genPlot.sh Script

genPlot.sh automates the creation of performance plots: at set time intervals it generates performance plots and stores them in the appropriate folder.

Monit helps automate the generation of performance plots at a custom frequency.

Note: Generating a performance plot for the past 1 hour takes about 7-8 minutes using the genPlot.sh script.

Example: You can generate a performance plot every hour, day, week, fortnight, month, or year.

You can also use genPlot.sh as a standalone script to generate plots as required.

Automating the Process of Performance Plot Creation

To automate plot creation at a custom frequency, follow the steps below:

Installation

Download the genPlot.tar.gz file

 wget https://<Download Location> -O /tmp/genPlot.tar.gz && tar -xzvf /tmp/genPlot.tar.gz -C /usr/local/src/

Edit the plot.monit file and comment out the time frequencies that will not be used to generate plots.

After updating, copy the plot.monit file to /etc/monit/conf.d/

 cp /usr/local/src/plot.monit /etc/monit/conf.d/

Copy the genPlot.sh script to /usr/local/bin/

 cp /usr/local/src/genPlot.sh /usr/local/bin/

Add execute permissions

 chmod 755 /usr/local/bin/genPlot.sh

Check the Monit control file and reload Monit

 monit -t && monit reload

Performance plots will be generated every hour, day and week as per the plot.monit file.
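
For the step of commenting out unused frequencies in plot.monit, the sketch below shows one possible approach on a throwaway sample file. The stanzas are hypothetical: they use standard Monit "check program" syntax and check names modelled on the PERFORMANCE_PLOT_EVERY_* entries found in monit.log, but the plot.monit shipped in genPlot.tar.gz may look different, and the cron-style schedules are assumptions.

```shell
# Hypothetical sample; the real plot.monit may differ in names and schedules.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
check program PERFORMANCE_PLOT_EVERY_HOUR with path "/usr/local/bin/genPlot.sh Hour"
    every "0 * * * *"
    if status != 0 then alert

check program PERFORMANCE_PLOT_EVERY_WEEK with path "/usr/local/bin/genPlot.sh Week"
    every "0 0 * * 0"
    if status != 0 then alert
EOF

# Comment out the weekly stanza: every line from its "check program" line
# up to the next blank line (or end of file) gets a leading '#'.
sed -i '/^check program PERFORMANCE_PLOT_EVERY_WEEK/,/^$/ s/^\([^#]\)/#\1/' "$sample"

cat "$sample"
```

After editing the real file, monit -t reports any syntax errors before the reload.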

Validating

To view the logs for plot creation, check your /var/log/monit.log file.

Note: Do not check the logs immediately after setting up the scripts; wait at least a couple of hours first.

 grep -E "PERFORMANCE_PLOT_EVERY_(HOUR|DAY|WEEK)" /var/log/monit.log
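
Besides the Monit log, you can check that plot files are actually being written. The helper below is a small sketch, not part of genPlot.sh. It assumes the plots land under /var/www/safesquid/performance_plot, the location given later on this page, and recent_plots is a hypothetical function name.

```shell
# recent_plots [DIR] [MINUTES]
# Lists files under DIR modified within the last MINUTES minutes.
# Defaults: the plot directory named on this page, and 120 minutes.
recent_plots() {
    find "${1:-/var/www/safesquid/performance_plot}" -type f -mmin "-${2:-120}" | sort
}

# Only run against the default directory if it exists on this machine.
if [ -d /var/www/safesquid/performance_plot ]; then
    recent_plots
fi
```

An empty result a couple of hours after setup suggests the Monit checks are not running or genPlot.sh is failing.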

How to view generated plots

Using a Web-Server

To view the generated plots in your browser, install the Apache web server.

To install the Apache web server, run the command below:

 apt install apache2

Edit /etc/apache2/sites-enabled/000-default.conf

 vim /etc/apache2/sites-enabled/000-default.conf

Update the document root from /var/www/html to /var/www/safesquid

Now reload the site configuration using the command below:

 a2dissite 000-default.conf && systemctl reload apache2 && a2ensite 000-default.conf && systemctl reload apache2

Now open your browser and access the web server using the server's IP address.
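
If you prefer not to edit the file by hand in vim, the document root change can also be scripted. The sketch below assumes GNU sed and a stock Debian/Ubuntu 000-default.conf containing a "DocumentRoot /var/www/html" line. set_docroot is a hypothetical helper name.

```shell
# set_docroot FILE
# Rewrites "DocumentRoot /var/www/html" to "DocumentRoot /var/www/safesquid" in FILE.
set_docroot() {
    # --follow-symlinks: files in sites-enabled are normally symlinks into
    # sites-available, and plain "sed -i" would replace the link with a regular file.
    sed -i --follow-symlinks \
        's|^\([[:space:]]*DocumentRoot[[:space:]]\{1,\}\)/var/www/html[[:space:]]*$|\1/var/www/safesquid|' \
        "$1"
}

# Example (run as root on the proxy server):
#   set_docroot /etc/apache2/sites-enabled/000-default.conf
```

Reload Apache afterwards, as in the command above, for the change to take effect.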

On a Local Machine

Users without a web server can access the generated performance plots from the /var/www/safesquid/performance_plot location.

To view all plots created, run the command below:

 tree -af /var/www/safesquid

Based on the frequency set, you can open the corresponding folder and view your performance plots.

For example, users who have configured Monit to generate a plot every hour will find the performance plots inside the Every_Hour folder.

Copy the files to your local machine and view the performance plots using any image viewer.

Generate plot as required

Usage

To generate a plot as required, run the command:

 genPlot.sh <options>
 Options: Hour, Day, Week, Fortnight, Month, Year
 Example: genPlot.sh Hour

The example command generates a performance plot for the last 1 hour.

Difference between revisions of "Generating Performance Plot & Automating the Process"