Introduction
Efficient web server log management is crucial for maintaining your website's performance, troubleshooting issues, and gaining insight into user behavior. Apache is one of the most popular web servers, and it generates access and error logs that contain valuable information. To manage and analyze these logs effectively, you can use Logstash to process them and send them to DigitalOcean's Managed OpenSearch for indexing and visualization.
In this tutorial, we will guide you through installing Logstash on your Droplet, configuring it to collect Apache logs, and sending them to Managed OpenSearch for analysis.
Prerequisites
- One or more Droplets with the Apache web server installed.
- A DigitalOcean Managed OpenSearch cluster.
Step 1 – Install Logstash
Logstash can be installed using binary files or via package repositories. For easier management and updates, using package repositories is generally recommended.
In this section, we will guide you through installing Logstash on your Droplet using either the APT or YUM package manager, depending on your distribution.
Let's identify the operating system:
cat /etc/os-release
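The output includes fields such as ID and VERSION_ID. On an Ubuntu 22.04 Droplet, for example, it may look similar to this (illustrative; your values will differ):
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
ID=ubuntu
ID_LIKE=debian
If ID is ubuntu or debian, follow the APT instructions below; if it is centos, rhel, or a derivative, follow the YUM instructions.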
For APT-based systems (Ubuntu/Debian)
Download and install the public signing key:
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
You may need to install the apt-transport-https package on Debian before continuing:
sudo apt-get install apt-transport-https
Save the repository definition in /etc/apt/sources.list.d/elastic-8.x.list:
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
To add the Logstash repository, use the echo method described above. Do not use add-apt-repository as it will also add a deb-src entry, but we do not provide a source package. If you have added a deb-src entry, you will see the following error:
Unable to find expected entry 'main/source/Sources' in Release file (Wrong sources.list entry or malformed file)
Just remove the deb-src entry from the /etc/apt/sources.list file and the installation should work as expected.
The repository is now ready to use. Update the package index and install Logstash:
sudo apt-get update && sudo apt-get install logstash
For YUM-based systems (CentOS/RHEL)
Download and install the public signing key:
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
Add the following to your /etc/yum.repos.d/logstash.repo file. You can use tee to create and populate the file:
sudo tee /etc/yum.repos.d/logstash.repo > /dev/null <<EOF
[logstash-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

The repository is ready to use. Install Logstash with:
sudo yum install logstash
For more information, please refer to the Logstash installation guide.
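Whichever package manager you used, you can confirm the installation by checking the version (the path below is the default location for package-based installs):
/usr/share/logstash/bin/logstash --version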
Step 2 – Configure Logstash to send logs to OpenSearch
The Logstash pipeline consists of three main stages: input, filter, and output. Each stage is implemented by plugins; you can use community plugins or create your own. A minimal skeleton showing all three stages follows the list below.
- Input: This stage collects data from various sources. Logstash supports multiple input plugins to manage data sources such as log files, databases, message queues, and cloud services.
- Filter: This stage processes and transforms the data collected in the input stage. Filters can modify, enrich, and structure the data to make it easier to analyze.
- Output: This stage sends the processed data to a destination. Destinations can include databases, files, and data stores such as OpenSearch.
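To make the structure concrete, here is a minimal pipeline sketch that reads from standard input, adds a field, and prints structured events to the console. This is illustrative only and is not the configuration we will build for Apache logs:
input {
  stdin { }                                  # read events from standard input
}

filter {
  mutate {
    add_field => { "source" => "stdin" }     # enrich each event with a field
  }
}

output {
  stdout { codec => rubydebug }              # print parsed events to the console
}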
Step 3 – Install the OpenSearch output plugin
The OpenSearch output plugin can be installed by running the following command:
/usr/share/logstash/bin/logstash-plugin install logstash-output-opensearch
More information can be found in the logstash-output-opensearch plugin repository.
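You can confirm the plugin is installed by listing Logstash's plugins and filtering for it:
/usr/share/logstash/bin/logstash-plugin list | grep opensearch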
Now let's create a pipeline:
Create a new file called apache_pipeline.conf in the /etc/logstash/conf.d/ directory and add the following contents:
input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => "apache_access"
  }

  file {
    path => "/var/log/apache2/error.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => "apache_error"
  }
}

filter {
  if "apache_access" in [tags] {
    grok {
      match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
    mutate {
      remove_field => [ "message", "[log][file][path]", "[event][original]" ]
    }
  } else {
    grok {
      match => { "message" => "%{HTTPD24_ERRORLOG}" }
    }
  }
}

output {
  if "apache_access" in [tags] {
    opensearch {
      hosts => "https://<OpenSearch-Hostname>:25060"
      user => "doadmin"
      password => "<your_password>"
      index => "apache_access"
      ssl_certificate_verification => true
    }
  } else {
    opensearch {
      hosts => "https://<OpenSearch-Hostname>:25060"
      user => "doadmin"
      password => "<your_password>"
      index => "apache_error"
      ssl_certificate_verification => true
    }
  }
}

Replace <OpenSearch-Hostname> with the hostname of your OpenSearch server and <your_password> with your OpenSearch password.
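Before starting the service, you can check the file for syntax errors; the --config.test_and_exit flag parses the configuration and exits without processing any events:
sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/apache_pipeline.conf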
Let's break down the configuration above.
- INPUT: Used to configure a source for events. The file input plugin is used here.
- path => "/var/log/apache2/access.log": Specifies the path to the Apache access log file that Logstash should read from.
Make sure the Logstash service has access to the input path.
- start_position => "beginning": Specifies where Logstash should start reading the log file. "beginning" indicates that Logstash should start processing the file from the beginning rather than from the end.
- sincedb_path => "/dev/null": Specifies the path to the sincedb file, which Logstash uses to track its current position in each log file so it can resume where it left off after a restart or crash. Pointing it at /dev/null discards that state, so Logstash re-reads the file from the beginning on every restart; this is convenient for testing, but for production you may prefer a persistent path (a sketch of that variant follows this list).
- tags => "apache_access": Assigns a tag to events read from this input. Tags are useful for identifying and filtering events, typically in downstream filter conditionals or output routing; we use them for both here.
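For reference, a persistent variant of the access log input might look like this (a sketch; the sincedb path shown is an assumption, and the directory must be writable by the logstash user):
file {
  path => "/var/log/apache2/access.log"
  start_position => "beginning"
  # persist the read position so a restart resumes where it left off
  sincedb_path => "/var/lib/logstash/sincedb_apache_access"
  tags => "apache_access"
}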
- FILTER: Used to process events.
Starting with conditionals:
(if "apache_access" in [tags]):
This checks whether the apache_access tag is present in the [tags] field of the incoming event. We use this condition to apply the appropriate grok filter to Apache access logs versus error logs.
- Grok filter (for Apache access logs):
grok {
  match => { "message" => "%{HTTPD_COMBINEDLOG}" }
}

The %{HTTPD_COMBINEDLOG} pattern is predefined in Logstash and parses the Apache combined access log format. It extracts fields such as the client IP address, timestamp, HTTP method, URI, status code, and user agent from the message field of each event.
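For example, given a combined format line like this made-up entry:
203.0.113.5 - - [09/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/" "Mozilla/5.0"
the pattern extracts the client address (203.0.113.5), the request method (GET), the status code (200), and the response size (2326), among other fields. The exact field names depend on the pattern definition and your ECS compatibility setting.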
- Mutate filter (optional): After parsing the Apache logs, we use the mutate filter's remove_field option to drop fields we no longer need:

mutate {
  remove_field => [ "message", "[log][file][path]", "[event][original]" ]
}

- Another condition: If the apache_access tag is not in [tags], the else block is executed. It contains a second grok filter, this time for Apache error logs:
grok {
  match => { "message" => "%{HTTPD24_ERRORLOG}" }
}
The %{HTTPD24_ERRORLOG} pattern parses messages that match the Apache error log format, extracting fields such as the timestamp, log level, and error message.
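For instance, an error log line like this made-up entry matches the pattern:
[Wed Oct 09 13:55:36.123456 2024] [core:error] [pid 1234:tid 140229473968128] AH00124: Request exceeded the limit of 10 internal redirects.
From it, the pattern extracts the timestamp, the module and log level (core:error), and the process and thread IDs, among other fields.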
You can find the full set of predefined grok patterns in the logstash-patterns-core repository: https://github.com/logstash-plugins/logstash-patterns-core/tree/main/patterns
- OUTPUT: This stage sends processed events to a destination; here, the opensearch output plugin is used.
The output block starts with the same kind of conditional:

if "apache_access" in [tags] {}

The condition routes events to two separate indices in OpenSearch, apache_access and apache_error.
Let's look at the options of the OpenSearch output plugin:

- hosts => "https://<OpenSearch-Hostname>:25060": The URL of your OpenSearch server.
- user => "doadmin": Your OpenSearch username.
- password => "<your_password>": Your OpenSearch password.
- index => "apache_error": The name of the OpenSearch index to write to.
- ssl_certificate_verification => true: Enables SSL certificate verification.

Step 4 – Start Logstash
After configuring the pipeline, start the Logstash service:
sudo systemctl enable logstash.service
sudo systemctl start logstash.service
sudo systemctl status logstash.service

Step 5 – Troubleshooting
Check the connection.
You can verify that Logstash can connect to OpenSearch by testing the connection:
curl -u your_username:your_password -X GET "https://your-opensearch-server:25060/_cat/indices?v"
Replace your-opensearch-server with the hostname of your OpenSearch server and your_username:your_password with your OpenSearch credentials.
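If Logstash is shipping events, the output should list the two indices created by the pipeline. It may look similar to this (illustrative values; the uuid column is truncated here):
health status index         uuid      pri rep docs.count docs.deleted store.size pri.store.size
green  open   apache_access x8Qw...     1   1       1042            0      1.2mb          612kb
green  open   apache_error  p3Jk...     1   1         87            0    154.3kb         77.1kb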
Data ingestion
Ensure that the data is properly indexed in OpenSearch:
curl -u your_username:your_password -X GET "https://your-opensearch-server:25060/<your-index-name>/_search?pretty"
As before, replace the hostname and credentials, and replace <your-index-name> with the name of the index you want to query (apache_access or apache_error).
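You can also narrow the search with a query body. For example, the following request uses a standard match query to return access log events with HTTP status 404; the response field name is an assumption based on the grok pattern above, so adjust it to match your indexed documents:
curl -u your_username:your_password -X GET "https://your-opensearch-server:25060/apache_access/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '{ "query": { "match": { "response": "404" } } }'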
Firewall and network configuration
Make sure your firewall rules and network settings allow traffic between Logstash and OpenSearch on port 25060.
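You can test basic reachability from the Droplet with a tool such as netcat:
nc -zv your-opensearch-server 25060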
Logs
Logs related to Logstash can be found in /var/log/logstash/logstash-plain.log.
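To watch for pipeline errors in real time, tail that file, or follow the service output with journalctl:
sudo tail -f /var/log/logstash/logstash-plain.log
sudo journalctl -u logstash.service -f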
Conclusion
In this guide, we set up Logstash to collect and send Apache logs to OpenSearch. Here's a quick summary of what we covered:
Installing Logstash: We explained how to use the APT or YUM package managers, depending on your Linux distribution, to install Logstash on your Droplet.
Logstash Configuration: We created and configured the Logstash configuration file to ensure that Apache logs are properly parsed and sent to OpenSearch.
Verification in OpenSearch: We queried OpenSearch directly to confirm that the logs are properly indexed and available for analysis.
With these steps completed, you should now have a working setup where Logstash collects Apache logs and sends them to OpenSearch.