Project Crux Documentation
Latest revision as of 04:30, 1 April 2026
Introduction
Project Crux is a Linux-based LAMP stack deployment on Microsoft Azure focused on building, securing, monitoring, and documenting a working cloud-hosted web application environment.
In this document, I will go over how I deployed and configured the Azure virtual machine, networking, Network Security Groups, Apache, MariaDB, PHP, and MediaWiki, along with HTTPS, Azure Monitor, Log Analytics, alerting, Azure Key Vault, cost tracking, and the troubleshooting I had to do throughout the build.
The final result was a low-cost, operations-focused Azure environment that emphasized security, monitoring, documentation, and budget awareness.
Project Overview and Architecture Diagram
I deployed the infrastructure in its own Azure Resource Group and set it up in a virtual network with both a public subnet and a private subnet. The public subnet is where I placed the VM, and I only allowed the ports that were required for management and web access. The private subnet was included to reflect better network design and to leave room for separating services later if the environment were expanded.
For security, I used NSGs, UFW, SSH key authentication, HTTPS, database hardening, and Azure Key Vault for handling credentials. For monitoring and visibility, I set up Azure Monitor, a Log Analytics Workspace, the Azure Monitor Agent, a Data Collection Rule, alert rules, and a custom Azure Dashboard. I also created a cost alert rule to notify me if the total cost of the build went over $50.
Architecture details:
• Resource Group: rg-crux
• Azure Region: West US 2
• VNet CIDR: 10.40.0.0/16
• Public Subnet CIDR: 10.40.1.0/24
• Private Subnet CIDR: 10.40.2.0/24
• VM Name: vm-crux
• Static Public IP: 20.3.236.215
• Private IP: 10.40.1.4
• CMS URL: https://crux.silkspace.net/index.php/Main_Page
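The address plan listed above can be sanity-checked with a few lines of Python. This is only a minimal sketch using the standard `ipaddress` module; the CIDR ranges and the VM address are copied from the list, while the function name `check_layout` and the wording of the checks are illustrative choices rather than part of the original build.

```python
import ipaddress

# Values copied from the architecture details above.
VNET = ipaddress.ip_network("10.40.0.0/16")
PUBLIC_SUBNET = ipaddress.ip_network("10.40.1.0/24")
PRIVATE_SUBNET = ipaddress.ip_network("10.40.2.0/24")
VM_PRIVATE_IP = ipaddress.ip_address("10.40.1.4")


def check_layout() -> dict[str, bool]:
    """Return a named set of sanity checks for the address plan."""
    return {
        "public subnet inside VNet": PUBLIC_SUBNET.subnet_of(VNET),
        "private subnet inside VNet": PRIVATE_SUBNET.subnet_of(VNET),
        "subnets do not overlap": not PUBLIC_SUBNET.overlaps(PRIVATE_SUBNET),
        "VM address in public subnet": VM_PRIVATE_IP in PUBLIC_SUBNET,
        "VM address not in private subnet": VM_PRIVATE_IP not in PRIVATE_SUBNET,
    }


if __name__ == "__main__":
    for name, ok in check_layout().items():
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```

A check like this is mostly useful if the environment is expanded later, since a new subnet that overlaps an existing range would show up as a failed check before anything is deployed.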
The screenshot below shows the Project Crux Azure architecture, including segmented networking, VM placement, Key Vault integration, and monitoring components.

Design Decisions
Decision 1) CMS Selection: MediaWiki Instead of WordPress or Ghost
I chose MediaWiki as the platform for this project because it made the most sense for what Project Crux was actually asking for. A big part of the project was documentation, troubleshooting, and showing the design and build process, so I wanted something that worked well for technical writeups instead of just looking polished. MediaWiki fit that better than WordPress or Ghost because it is built around pages, internal links, and documentation-style content. Since I also had to host my documentation on the deployed application, MediaWiki made the whole project feel more connected because the application itself became part of the evidence.
I did consider WordPress because it is more familiar and can be made to look better more quickly. I decided against it because most of its strengths come from themes and plugins, which I did not really need for a project focused on technical documentation instead of a business or portfolio website. I also considered Ghost, but I ruled it out because it adds another layer of complexity with Node.js, and I did not want to create extra troubleshooting work for myself when MediaWiki already fit the project better.
In the end, I picked MediaWiki because it matched the requirements better and because it was the platform I was already the most familiar with.
Decision 2) Public and Private Subnets in a Single VNet
I chose to build the project inside a single VNet with separate public and private subnets because it met the project requirements while also reflecting a more realistic cloud network design. The VM was placed in the public-facing subnet so it could be reached over HTTPS and managed through SSH from my own IP address only. The second subnet was kept private and reserved for future service separation, such as moving the database or other internal services off the web server if the environment were expanded later.
I did consider building the network with only a single subnet, since that would have been the easiest minimum setup. I decided against that because it would have met the bare minimum without really showing proper segmentation or better design practice. I also considered going further with additional Azure services and more separation, but I ruled that out because this project was meant to stay low-cost and centered around a single VM deployment.
Below is a screenshot showing the two subnets that were created for the environment.

Decision 3) Key Vault Instead of Plain-Text Credential Storage
I chose to store sensitive values in Azure Key Vault instead of just leaving them in config files or local notes on the VM. Even though this was still a small student deployment, putting the database and CMS credentials into Key Vault was a cleaner and more secure way to handle secrets. It also created better separation between the infrastructure, the application, and the credentials being used to support it.
I could have just left the database password inside the MediaWiki configuration and written the rest down in my own notes, which would have been faster in the short term. I decided against that because Project Crux places a clear emphasis on secrets management, and using Key Vault did a better job of showing that I understood how to handle credentials properly in Azure.
Key Vault access was managed using Azure role-based access control (RBAC) rather than legacy vault access policies. Since the vault was configured for RBAC, the Access policies blade was intentionally unavailable. To manage the secrets, I assigned myself the permissions needed through Access control (IAM), which allowed me to create, view, and manage the stored credentials in the vault.

In the end, I used Key Vault to store the MediaWiki and database-related passwords instead of leaving them only on the VM. Below is a screenshot showing the secrets that were saved in the vault.
Programmatic Secret Retrieval Method
To retrieve secrets programmatically, I would sign in with the Azure CLI and then pull a stored secret, such as the MediaWiki database password, from Azure Key Vault:
az login
az keyvault secret show --vault-name kv-crux --name wiki-db-password --query value -o tsv
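If the secret is needed inside a script rather than at the terminal, the same `az keyvault secret show` command can be wrapped in a small helper. The sketch below assumes the Azure CLI is installed and that `az login` has already been run; the function names `secret_show_command` and `read_secret` are hypothetical and only illustrate one way to do it.

```python
import subprocess


def secret_show_command(vault_name: str, secret_name: str) -> list[str]:
    """Build the argv for `az keyvault secret show`, returning only the value."""
    return [
        "az", "keyvault", "secret", "show",
        "--vault-name", vault_name,
        "--name", secret_name,
        "--query", "value",
        "-o", "tsv",
    ]


def read_secret(vault_name: str, secret_name: str) -> str:
    """Run the Azure CLI and return the secret without the trailing newline.

    Assumes the Azure CLI is on PATH and the caller is already signed in.
    """
    result = subprocess.run(
        secret_show_command(vault_name, secret_name),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if az exits with an error
    )
    return result.stdout.rstrip("\n")
```

Calling `read_secret("kv-crux", "wiki-db-password")` would return the stored value as a string. Passing the arguments as a list instead of a shell string avoids quoting problems, and the returned value should be used directly rather than printed or logged.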
Decision 4) Single B1s VM with Local MariaDB Instead of Managed Database Services
I chose to keep the whole LAMP stack on one VM and run the database locally on that same server. This matched the low-cost design of Project Crux and avoided spending extra money on managed database services. Since the project requirements allowed a local MySQL or MariaDB instance and focused more on operational practices than building out a bunch of separate services, a single-server setup made the most sense.
I did consider using an Azure-managed database service because it would have separated the application and database more cleanly. I decided against that because it would have added cost, increased the number of resources I had to manage, and taken time away from the monitoring, identity, and documentation parts of the project, which were more important for this build.
In the end, keeping MediaWiki and MariaDB on the same VM was the simplest and most practical choice for this project.
Security Posture
Security in this deployment was handled using multiple layers instead of relying on one setting alone. At the Azure network layer, I created NSG rules that allowed only the traffic required for management and application access. SSH was restricted to my own source IP instead of being left open to the internet, while HTTP and HTTPS were allowed for web access. Everything else remained blocked by default. This reduced the exposed attack surface and kept the VM aligned with the project requirement that SSH must not be open to 0.0.0.0/0.
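The way these inbound rules combine can be modelled in a few lines. The sketch below is a simplified illustration of priority-ordered rule evaluation, not a copy of the actual NSG: the rule names, the priorities, and the admin address `203.0.113.10` (taken from a documentation-only range) are placeholders.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int   # lower number is evaluated first
    source: str     # CIDR, or "*" for any source
    port: int
    allow: bool


# Hypothetical rule set: names, priorities, and the admin address are placeholders.
RULES = [
    Rule("Allow-SSH-Admin", 100, "203.0.113.10/32", 22, True),
    Rule("Allow-HTTP", 110, "*", 80, True),
    Rule("Allow-HTTPS", 120, "*", 443, True),
]


def is_allowed(rules: list[Rule], source_ip: str, port: int) -> bool:
    """Return the decision of the first matching rule, or deny if none match."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        source_matches = rule.source == "*" or addr in ipaddress.ip_network(rule.source)
        if source_matches and rule.port == port:
            return rule.allow
    # Stands in for the default deny applied to unmatched inbound internet traffic.
    return False
```

With this model, `is_allowed(RULES, "198.51.100.7", 22)` is denied because the source is not the admin address, while the same source is allowed on port 443.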

At the operating system layer, I hardened SSH by disabling password authentication and disabling root login, which left key-based authentication as the only allowed sign-in method. I also created and used a non-root administrative user for remote access instead of managing the server directly as root. In addition to the Azure NSG, I enabled UFW as a host-based firewall to provide another layer of protection in case a cloud-side rule was changed or misconfigured later.
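The two SSH settings described above can be checked from a configuration file with a short script. This is a rough sketch under simplifying assumptions: it reads plain `keyword value` lines only and ignores `Include` files and `Match` blocks, and the sample text is illustrative rather than copied from the Project Crux VM.

```python
# Settings described above: key-only sign-in and no direct root login.
REQUIRED = {"passwordauthentication": "no", "permitrootlogin": "no"}


def parse_sshd_config(text: str) -> dict[str, str]:
    """Parse `keyword value` lines, keeping the first occurrence of each keyword."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Keywords are case-insensitive, and sshd uses the first value it reads.
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def hardening_gaps(text: str) -> list[str]:
    """Return the required keywords that are missing or set to another value."""
    found = parse_sshd_config(text)
    return [key for key, want in REQUIRED.items() if found.get(key) != want]


# Illustrative configuration text, not copied from the Project Crux VM.
SAMPLE = """
# hardened settings
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
"""

if __name__ == "__main__":
    print(hardening_gaps(SAMPLE) or "no gaps found")
```

An empty result means both settings are present with the expected value. A missing keyword is reported as a gap on purpose, since relying on a compiled-in default is harder to audit than an explicit line.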
For the web application stack, HTTPS was configured so the site was served over an encrypted connection, and HTTP requests were redirected to HTTPS to enforce encrypted access. On the database side, MariaDB was bound to 127.0.0.1 so it would not listen on the VM’s external network interfaces. MediaWiki file permissions were also configured so the web content had appropriate ownership and was not left overly permissive. Finally, the system was fully patched before final submission, and the patch date was recorded as part of the final security state of the server.
Security information
Patch date: 2026-03-28
SSH authentication mode: Key-only
Root login: Disabled
Host firewall: UFW active, allowing 22/tcp, 80/tcp, and 443/tcp
Database bind address: 127.0.0.1
TLS approach: Let’s Encrypt certificate issued with Certbot for crux.silkspace.net
Allowed inbound ports: 22/tcp, 80/tcp, 443/tcp
The screenshot below shows the public NSG and the security rules that were applied to it.
The screenshot below shows the VNet configured for this project, including the proper network layout and security-related settings.
Nmap Security Verification
An external Nmap scan was used to verify that the VM’s exposed services matched the intended security design. The goal was to confirm that only the required ports were reachable from outside the environment and that unnecessary services were not exposed publicly. Since the web server needed to support administration and application access, the expected result was that SSH, HTTP, and HTTPS would appear reachable under normal conditions, while the MariaDB service would remain inaccessible externally because it was bound to localhost on the VM.
The supporting screenshots also show that the database was listening only on 127.0.0.1:3306, that UFW was active, and that SSH password authentication and root login were disabled. This established the expected internal security posture before validating the external attack surface with Nmap.
Nmap Interpretation

The final Nmap scans showed that the exposed network surface matched the intended design. Under normal configuration, only ports 22/tcp, 80/tcp, and 443/tcp were visible externally, which aligned with the required management and web access for the project. No database service was exposed publicly, which confirmed that MariaDB was correctly kept local to the VM and bound to localhost only. I also tested the HTTPS rule by temporarily changing the NSG rule from Allow to Deny and then scanning port 443 specifically. Nmap reported the port as filtered, which confirmed that the NSG rule change was being enforced as expected. Overall, the scans showed that the NSG and host firewall configuration matched the intended security posture and did not leave unnecessary services reachable from the internet.
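The comparison between scan results and the intended design can also be done in code. The sketch below parses the port table from Nmap's normal text output and reports any open port outside the expected set; the sample text only imitates that table layout and is not the actual scan output from this project.

```python
import re

EXPECTED_OPEN = {22, 80, 443}   # ports the design intends to expose

# Matches port table rows in Nmap's normal output, e.g. "443/tcp open  https".
PORT_ROW = re.compile(r"^(\d+)/tcp\s+(\S+)")


def port_states(nmap_output: str) -> dict[int, str]:
    """Map each TCP port listed in the scan output to its reported state."""
    states: dict[int, str] = {}
    for line in nmap_output.splitlines():
        match = PORT_ROW.match(line.strip())
        if match:
            states[int(match.group(1))] = match.group(2)
    return states


def unexpected_open_ports(nmap_output: str) -> set[int]:
    """Return ports reported open that are not part of the intended surface."""
    open_ports = {p for p, s in port_states(nmap_output).items() if s == "open"}
    return open_ports - EXPECTED_OPEN


# Illustrative text in Nmap's table layout, not the actual scan results.
SAMPLE_NORMAL = """
PORT    STATE SERVICE
22/tcp  open  ssh
80/tcp  open  http
443/tcp open  https
"""

SAMPLE_HTTPS_DENIED = """
PORT    STATE    SERVICE
443/tcp filtered https
"""
```

For the first sample, `unexpected_open_ports` returns an empty set, which corresponds to the normal configuration. For the second, `port_states` reports 443 as `filtered`, matching the result of the temporary deny rule.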
The screenshots below show both the internal validation of the VM’s local security posture and the external Nmap scans used to verify that only the intended services were reachable from the internet.
In the screenshot below, MariaDB is shown listening only on 127.0.0.1:3306, UFW is active with rules allowing only the required inbound ports, and SSH is configured with permitrootlogin no and passwordauthentication no.
In the screenshot below, Nmap identified only the expected externally reachable services: 22/tcp (SSH), 80/tcp (HTTP), and 443/tcp (HTTPS). The scan also confirmed the presence of Apache and the Ubuntu-based SSH service, while all other tested TCP ports were filtered.
In the screenshot below, a basic Nmap scan confirms that only ports 22, 80, and 443 are open, while the remaining scanned ports are filtered or not reachable.
In the screenshot below, the Allow-HTTPS rule was temporarily changed so that inbound TCP 443 traffic would be denied. This was done to verify that Azure NSG changes were actually affecting external exposure as expected.
In the screenshot below, Nmap shows 443/tcp as filtered after the NSG rule was changed to deny HTTPS traffic. This confirmed that the NSG was enforcing the rule correctly from an external point of view.
Custom Domain and Azure DNS Bonus
As an additional enhancement, I configured a custom domain for the Project Crux environment using Azure DNS and secured it with a Let’s Encrypt certificate. Instead of leaving the MediaWiki site accessible only by public IP, I set up the subdomain crux.silkspace.net so the application could be reached through a proper hostname. This made the deployment cleaner, easier to present, and more realistic from an operational point of view.
To do this, I created the DNS zone in Azure and configured the required record set so the custom domain would resolve to the public IP of the Azure VM. This allowed the site to be reached by name rather than only by IP address. After DNS resolution was working, I used Certbot to request and install a Let’s Encrypt certificate for the subdomain so the site could be served securely over HTTPS.
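Before requesting the certificate, it helps to confirm that the hostname resolves to the VM's public IP. The sketch below shows one possible check using Python's standard `socket` module; the function name `resolves_to` and the injectable `resolver` parameter are illustrative, and the expected address is the static public IP recorded in the architecture details.

```python
import socket
from typing import Callable

HOSTNAME = "crux.silkspace.net"
EXPECTED_IP = "20.3.236.215"   # static public IP from the architecture details


def resolves_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True when the hostname's IPv4 address equals the expected one."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError; a failed lookup is a mismatch.
        return False
```

Calling `resolves_to(HOSTNAME, EXPECTED_IP)` performs a real lookup through the system resolver, so the answer can lag behind a recent record change until cached results expire. The `resolver` parameter exists so the logic can be exercised without network access.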
This bonus configuration demonstrated that the environment was not only functional, but also capable of supporting a more production-like DNS and TLS setup. It also showed that the MediaWiki site, Azure DNS configuration, and certificate deployment were all working together as part of the final Project Crux build.
The screenshot below shows the Azure DNS zone and record set used to make crux.silkspace.net resolve to the Project Crux VM.

The screenshot below shows the Let’s Encrypt certificate issued for crux.silkspace.net, confirming that the site was secured over HTTPS with a valid certificate.
Monitoring and KQL
For this project, monitoring was set up to show that the VM was not just deployed, but that I could actually see what was going on with it from an operations point of view. Boot Diagnostics was enabled on the VM so I would still have a way to troubleshoot startup problems if the server ever became unreachable through normal access. Azure Monitor was then used for live infrastructure visibility, while a Log Analytics Workspace was created to centralize log and performance data. The VM was connected to that workspace through the Azure Monitor Agent, and a Data Collection Rule was configured to ingest both Syslog data and performance counters.
I configured two alert rules: one for CPU usage and one for disk performance. These alerts were tied to an email-based Action Group so important events would send notifications instead of just sitting in the Azure portal unnoticed. I picked thresholds that were high enough to avoid constant noise, but still low enough to warn me before resource usage turned into an actual performance problem.
For the KQL part, I configured two queries that would let me monitor the environment more efficiently, and I pinned them to my dashboard so they could be easily accessed and reviewed.
Alert Rule 1: CPU Alert
- Signal: Percentage CPU
- Threshold: Greater than 80
- Evaluation period: 5 minutes
- Action Group: ag-crux / hudson.silk@itas.ca
- Why this threshold was chosen: I set the CPU alert to trigger when utilization exceeds 80% so I would be notified before sustained high CPU usage caused noticeable performance issues on the VM.
Alert Rule 2: Disk Alert
- Signal: Data Disk IOPS Consumed Percentage
- Threshold: Greater than 95
- Evaluation period: 5 minutes
- Action Group: ag-crux / hudson.silk@itas.ca
- Why this threshold was chosen: I configured a disk performance alert so I could detect heavy storage activity before disk contention affected MediaWiki or MariaDB performance.
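
The way a threshold and an evaluation period work together to avoid noisy alerts can be sketched as follows. This is an illustration only: it assumes the alert compares the average of one-minute samples over the 5-minute window against the threshold, and the function name `alert_fires` and the sample values are hypothetical.

```python
from statistics import mean

CPU_THRESHOLD = 80   # Percentage CPU, from Alert Rule 1
DISK_THRESHOLD = 95  # Data Disk IOPS Consumed Percentage, from Alert Rule 2


def alert_fires(samples, threshold, window=5):
    """Return True when the average of the last `window` samples exceeds threshold."""
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough data to fill the evaluation window
    return mean(recent) > threshold


# Hypothetical one-minute CPU readings (percent)
sustained_load = [70, 85, 90, 88, 92]  # average 85, above the threshold
brief_spike = [20, 30, 99, 25, 30]     # average 40.8, a single spike is ignored
```

Under these assumptions, `alert_fires(sustained_load, CPU_THRESHOLD)` is true while `alert_fires(brief_spike, CPU_THRESHOLD)` is false, which is the behaviour the thresholds were chosen for.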

Below is a screenshot showing the alert rules I created while setting up monitoring and learning how Azure alerting worked.
KQL Query 1: Failed SSH Login Activity
This query was created to identify failed SSH login attempts and help detect brute-force or repeated access attempts against the VM. In a real environment, a concentration of failed login activity from a single source would indicate either unauthorized probing or a misconfigured client attempting to authenticate repeatedly.
Syslog
| where TimeGenerated > ago(24h)
| where Facility in ("auth","authpriv")
| where ProcessName == "sshd" or SyslogMessage has "sshd"
| where SyslogMessage has_any (
"Failed publickey",
"invalid user",
"authentication failure",
"Connection closed by authenticating user",
"maximum authentication attempts exceeded"
)
| extend SourceIP = coalesce(
extract(@"\bfrom ([0-9]{1,3}(?:\.[0-9]{1,3}){3})\b", 1, SyslogMessage),
extract(@"\b([0-9]{1,3}(?:\.[0-9]{1,3}){3})\b", 1, SyslogMessage)
)
| summarize Attempts=count(), SampleMessages=make_set(SyslogMessage, 3) by SourceIP, Computer, bin(TimeGenerated, 1h)
| order by Attempts desc
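The `coalesce(extract(...), extract(...))` step is the least obvious part of this query, so here is a rough Python equivalent of it. It is a sketch, not the query itself: the sample messages are illustrative and were not captured from the Project Crux VM, and the addresses come from documentation ranges.

```python
import re
from collections import Counter

# Same two patterns as the KQL extract() calls above
FROM_IP = re.compile(r"\bfrom ([0-9]{1,3}(?:\.[0-9]{1,3}){3})\b")
ANY_IP = re.compile(r"\b([0-9]{1,3}(?:\.[0-9]{1,3}){3})\b")


def source_ip(message):
    """Prefer an address that follows 'from'; otherwise take the first IPv4-looking token."""
    for pattern in (FROM_IP, ANY_IP):
        match = pattern.search(message)
        if match:
            return match.group(1)
    return None


# Illustrative sshd messages (hypothetical)
SAMPLES = [
    "Invalid user admin from 203.0.113.7 port 40022",
    "Connection closed by authenticating user root 203.0.113.7 port 40100 [preauth]",
    "error: maximum authentication attempts exceeded for root from 198.51.100.4 port 51515 ssh2 [preauth]",
]

# Count attempts per source, similar in spirit to summarize ... by SourceIP
attempts = Counter(source_ip(message) for message in SAMPLES)
```

The second sample has no "from" before the address, which is why the fallback pattern is needed to keep such messages from being grouped under an empty source.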
KQL Query 2: Critical Admin and Service Activity

This query shows recent administrative and service-level activity affecting the VM, including SSH, sudo usage, Apache, and MariaDB events. I would use it during an incident to quickly see whether a service failed, restarted, or was modified by an administrator while troubleshooting availability problems.
Syslog
| where TimeGenerated > ago(24h)
| where Facility in ("auth","authpriv","daemon")
| where SyslogMessage has_any (
"sudo",
"sshd",
"apache2",
"mariadb",
"mysqld",
"Started",
"Stopped",
"Starting",
"Stopping",
"Reloaded",
"Failed",
"error"
)
| project TimeGenerated, Computer, Facility, ProcessName, SeverityLevel, SyslogMessage
| order by TimeGenerated desc
Cost Report
During the project, I kept costs down by only deploying the Azure resources that were necessary and avoiding unnecessary services. Before deployment, I used Microsoft’s pricing calculator to estimate the cost of the build, and that estimate is reflected in the service-by-service cost breakdown shown below.
Service-by-Service Cost Breakdown
| Service | Estimated Cost (monthly) | Actual Cost (To date) | Notes |
| Virtual Machine (originally B1s, later resized to B2als v2) | CA$8.11 | CA$0.90 | Cost shown for vm-crux. The VM was later resized during troubleshooting. |
| Managed Disk | CA$6.57 | CA$0.39 | Cost shown for vm-crux_osdisk_1_... |
| Static Public IP | CA$0.15 | CA$0.41 | Combined cost of vm-crux-ip (CA$0.29) and vnet-crux-ipv4 (CA$0.12). |
| Log Analytics Workspace | CA$0.83 | CA$0.00 | law-crux showed no billed cost at the time of capture. |
| Azure Monitor Alerts | CA$0.82 | CA$0.00 | alert-cpu-crux, disk-alert-crux, mem-test-crux, and ag-crux all showed CA$0.00. |
| Azure Key Vault | CA$0.04 | <CA$0.01 | kv-crux-hudson showed less than one cent in billed cost. |
| Microsoft Entra ID | CA$0.00 | CA$0.00 | No separate billed Entra ID resource cost was shown in the rg-crux view. |
| Total | CA$18.03 | CA$1.70 | Remaining credit: CA$135 |
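Summing the rows is a quick consistency check on the table. In this sketch the values are copied from the table above, and the Key Vault actual (shown as less than one cent) is treated as CA$0.00, which is an assumption.

```python
from decimal import Decimal

# (service, estimated monthly CA$, actual to-date CA$) copied from the table above
ROWS = [
    ("Virtual Machine", "8.11", "0.90"),
    ("Managed Disk", "6.57", "0.39"),
    ("Static Public IP", "0.15", "0.41"),
    ("Log Analytics Workspace", "0.83", "0.00"),
    ("Azure Monitor Alerts", "0.82", "0.00"),
    ("Azure Key Vault", "0.04", "0.00"),  # actual listed as <CA$0.01, assumed 0.00 here
    ("Microsoft Entra ID", "0.00", "0.00"),
]

# Decimal avoids floating-point rounding when adding currency amounts
estimated_total = sum(Decimal(estimate) for _, estimate, _ in ROWS)
actual_total = sum(Decimal(actual) for _, _, actual in ROWS)
```

With these inputs the actual column adds up to the stated CA$1.70, while the estimate rows add up to CA$16.52 rather than the stated CA$18.03. The source does not say what accounts for the CA$1.51 difference.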
Final Cost
After completing all the technical setup and configuration, the final cost of the deployment was CA$1.70.
Cost Analysis
The biggest cost driver in this deployment was expected to be the VM itself, especially the compute and storage tied to it. Even though the VM size was still fairly low-cost, compute charges continue to build whenever the VM stays allocated, and managed disk costs continue even when the VM is deallocated. Because of that, the final cost depends a lot on how long the VM stays online and how closely usage matches the original estimate. In a larger real-world deployment, the monthly cost would likely increase through longer VM uptime, additional managed services, higher log ingestion, and any future domain or DNS-related additions.

Below is a screenshot of the cost management cost analysis for my resource group for this project:
Troubleshooting Log
This project did not deploy cleanly on the first attempt, and several issues had to be diagnosed and corrected before the environment was fully functional. Each issue is laid out below with its symptom, root cause, resolution, and verification.
Issue 1: Azure Region Restrictions and Resource Placement
Symptom:
At the start of the project, some Azure resources could not be created in the intended region, and Azure did not clearly state which regions were allowed under the student subscription. As a result, I ended up creating resources in multiple different regions simply by selecting whichever regions would accept the deployment. This caused a design problem later because resources that were meant to work together were not always being deployed in the same location.
Root cause:
The underlying problem was that the Azure for Students subscription was restricted by a deployment policy that only allowed certain regions, but the portal did not make that very clear during the initial setup. Because of that, I was troubleshooting deployment failures while also trying to determine which regions were actually permitted. This led to a fragmented environment where some resources were created in different locations instead of being kept in one consistent region.
Resolution:
I reviewed the failed deployment and policy messages, identified which region would consistently accept the required services, and standardized the project environment in West US 2. After that, I rebuilt or re-created the required resources in the same region so that the VM, VNet, NSGs, monitoring components, Key Vault, and related services were aligned properly.
Verification:
I confirmed the fix by checking the Azure portal and verifying that the main project resources were all deployed in West US 2 and were able to reference each other correctly without region mismatch issues.
Issue 2: MediaWiki Skin Configuration Problem
Symptom:

After MediaWiki was installed, the site loaded, but the theme or skin configuration was incorrect. The wiki did not display properly using the intended default skin, which made the site appear broken even though the installation itself had succeeded.
Root cause:
The problem was caused by the MediaWiki configuration referencing a skin that was not properly loaded or enabled in the application configuration. In other words, the issue was not that MediaWiki had failed to install, but that the skin configuration in the settings file did not match the installed and enabled skins.
Resolution:
I opened the MediaWiki configuration file (LocalSettings.php) and changed the skin configuration so that MediaWiki would load a valid installed skin instead. This corrected the display issue and allowed the site to render normally.
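The mismatch described in this issue can be checked mechanically. The sketch below is a hypothetical helper, not something used in the original build: it reads the text of a LocalSettings.php file and reports whether the skin named in `$wgDefaultSkin` is provided by any skin loaded with `wfLoadSkin`. The sample settings are invented, since the source does not say which skin was involved, and the fallback to a lowercased directory name is an assumption.

```python
import re

# Skin keys registered by some bundled skins. Any other skin falls back to its
# lowercased directory name, which is an assumption of this sketch.
KNOWN_SKIN_KEYS = {
    "Vector": {"vector", "vector-2022"},
    "MonoBook": {"monobook"},
    "Timeless": {"timeless"},
    "MinervaNeue": {"minerva"},
}

# Anchoring at line start skips lines that are commented out with '#' or '//'
DEFAULT_SKIN = re.compile(r"""^\s*\$wgDefaultSkin\s*=\s*['"]([^'"]+)['"]\s*;""", re.MULTILINE)
LOADED_SKIN = re.compile(r"""^\s*wfLoadSkin\(\s*['"]([^'"]+)['"]\s*\)\s*;""", re.MULTILINE)


def default_skin_is_loaded(settings_text):
    """Return (default_skin, ok); ok is True when a loaded skin provides the default."""
    match = DEFAULT_SKIN.search(settings_text)
    if match is None:
        return None, False
    default = match.group(1).lower()
    available = set()
    for name in LOADED_SKIN.findall(settings_text):
        available |= KNOWN_SKIN_KEYS.get(name, {name.lower()})
    return default, default in available


# Hypothetical settings: the default skin's loader line is commented out
BROKEN = '$wgDefaultSkin = "vector-2022";\n# wfLoadSkin( \'Vector\' );\nwfLoadSkin( \'MonoBook\' );\n'
FIXED = BROKEN.replace("# wfLoadSkin", "wfLoadSkin")
```

For the `BROKEN` sample the helper reports that the default skin is not loaded, and for `FIXED` it reports that it is.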
Verification:
I verified the fix by reloading the MediaWiki site in the browser and confirming that the main page loaded correctly and the interface displayed normally.
Issue 3: SSH Access Lost Due to Incorrect NSG Rule
Symptom:
At one point during the build, I completely lost SSH access to the VM. The site also became difficult to troubleshoot because I could no longer reliably connect to the server remotely. Azure diagnostics showed that inbound SSH traffic was being denied.
Root cause:
This problem was caused by an incorrectly configured Network Security Group (NSG) rule. My SSH allow rule was configured with port 22 in the source port field, which meant the rule never matched a normal SSH connection. Since the rule did not match, Azure continued evaluating the NSG and eventually applied the default DenyAllInBound rule.
Resolution:
I corrected the NSG rule by setting the source port range to Any and keeping destination port 22 as the SSH target. That allowed Azure to correctly match inbound SSH traffic. I then re-tested connectivity and confirmed that the rule was finally being applied as intended.
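The reason the original rule never matched is easier to see in a small model of NSG evaluation. This is a simplified sketch: it only compares ports, ignores addresses, protocols, and the other default rules, and the rule name `Allow-SSH`, its priority of 100, and the client port are hypothetical. `DenyAllInBound` at priority 65500 is the Azure default rule named in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int          # lower numbers are evaluated first
    source_port: str       # "*" means Any
    destination_port: str  # "*" means Any
    access: str            # "Allow" or "Deny"


def port_matches(rule_port, port):
    """A rule port matches when it is Any or equals the flow's port."""
    return rule_port == "*" or rule_port == str(port)


def evaluate(rules, source_port, destination_port):
    """Return the first rule, in priority order, that matches the inbound flow."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if port_matches(rule.source_port, source_port) and port_matches(
            rule.destination_port, destination_port
        ):
            return rule
    return None


DENY_ALL = Rule("DenyAllInBound", 65500, "*", "*", "Deny")
BROKEN = Rule("Allow-SSH", 100, "22", "22", "Allow")  # source port wrongly set to 22
FIXED = Rule("Allow-SSH", 100, "*", "22", "Allow")    # source port Any, destination 22

CLIENT_PORT = 51514  # hypothetical ephemeral port picked by the SSH client
```

An SSH client connects from a high ephemeral port, not from port 22, so `evaluate([BROKEN, DENY_ALL], CLIENT_PORT, 22)` falls through to `DenyAllInBound`, while the same call with `FIXED` returns the allow rule.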
Verification:

I verified the fix in two ways:
- Azure Network Watcher diagnostics no longer showed the SSH connection being blocked by the default deny rule.
- I was able to successfully reconnect to the VM over SSH using my private key.
Issue 4: VM Instability Caused by Undersized Initial VM
Symptom:
Later in the project, the VM became unstable and unresponsive. SSH access became unreliable, the website would time out, and the server appeared to freeze under load. This made troubleshooting difficult because the machine was technically running but was not consistently responding to connections.
Root cause:
The original VM sizing was too weak once the project environment was actually in use. A nearly empty Linux VM worked acceptably at the start, but once I installed Apache, MariaDB, MediaWiki, and Azure monitoring components, the resource demands increased enough that the server became unstable. The issue was not obvious at the beginning because the machine had very little workload on it at first.
Resolution:
I resized the VM from B1s to B2als v2 to provide more resources. After increasing the available CPU and RAM, the system became responsive again and normal management access returned. This allowed me to continue the project and complete the remaining configuration tasks.
Verification:
I verified the fix by successfully reconnecting to the VM over SSH, confirming that the MediaWiki site became reachable again, and observing that the server stopped freezing under normal project workload.
Issue 5: Monitoring and Log Query Troubleshooting
Symptom:
During monitoring setup, my initial KQL queries for failed SSH sign-ins returned no results even after I intentionally generated failed SSH connection attempts.
Root cause:
The issue was that my original query was written to look for failed password-based SSH logins, but my server was configured for key-only SSH authentication. Because password authentication was disabled, those specific log patterns did not exist in the logs, even though failed authentication attempts were still occurring.
Resolution:
I adjusted the KQL queries to better match the actual authentication method used on the server. Instead of searching only for failed password events, I broadened the logic to look for relevant SSH authentication failures associated with a key-based configuration and verified that the Log Analytics workspace was receiving Syslog data from the VM.
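The difference between the two filters can be reproduced offline. In the sketch below, `has_any` is a rough stand-in for the KQL operator (a case-insensitive substring test rather than true term matching), and the log lines are illustrative examples of what a key-only server might record, not output captured from the VM.

```python
PASSWORD_ONLY = ["Failed password"]
BROADENED = [
    "Failed publickey",
    "invalid user",
    "authentication failure",
    "Connection closed by authenticating user",
    "maximum authentication attempts exceeded",
]


def has_any(message, terms):
    """Approximate KQL has_any with a case-insensitive substring check."""
    lowered = message.lower()
    return any(term.lower() in lowered for term in terms)


# Illustrative messages from a server with password authentication disabled
KEY_ONLY_LOG = [
    "Invalid user admin from 203.0.113.7 port 40022",
    "Connection closed by authenticating user root 203.0.113.7 port 40100 [preauth]",
    "Connection closed by invalid user test 203.0.113.7 port 40102 [preauth]",
]

password_hits = [m for m in KEY_ONLY_LOG if has_any(m, PASSWORD_ONLY)]
broadened_hits = [m for m in KEY_ONLY_LOG if has_any(m, BROADENED)]
```

On these samples the password-only filter returns nothing, while the broadened list matches every line, which mirrors the empty result described in the symptom.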
Verification:
I confirmed the fix by successfully running Syslog queries in Log Analytics, identifying SSH-related events in the logs, and pinning meaningful KQL output to the Azure dashboard.
Reflection

Technical Reflection
If I were to rebuild this environment, the biggest technical change I would make would be planning the infrastructure around the actual workload earlier instead of just the minimum starting design. The original single-VM setup made sense for keeping the build simple and low-cost, but once Apache, MariaDB, MediaWiki, and monitoring were all running together, that design started to show its limits. I also would have standardized the region and deployment choices much earlier, because a lot of wasted time came from Azure not clearly showing which regions were actually available under the subscription. The one-VM approach was fine for this project, but if this were expanded into a more serious deployment, I would separate roles earlier and avoid putting everything on one system.
Process Reflection
From a process point of view, the biggest lesson from this project was that documentation, screenshots, and proof collection need to happen while the work is being done, not after. A lot of the project was not just about getting the environment working, but being able to explain what I built, why I built it that way, and how I fixed problems along the way. I also learned that I should have validated important things earlier, like region restrictions and tenant access, because those ended up slowing me down later. If I were doing this again, I would keep a tighter running log of configuration choices, screenshots, and troubleshooting steps as they happened so the final writeup would be easier and more accurate.
Azure Dashboard
The Azure Dashboard was created as the operational summary view for the environment and used as a live supplement to the hosted Ops Brief. It included all of the required panels from the revised Project Crux deliverables, including VM CPU and memory, alert rule status, cost to date versus budget, Key Vault operations, a KQL query result, and the resource group overview. This dashboard was used to confirm that the project environment was not only deployed, but also actively observable and manageable from within Azure.
Building the dashboard ended up being one of the more frustrating parts of the project because the Pin to dashboard option was not always located where I expected it to be. In many cases, the resources or graphs could not be added directly from the dashboard edit view and instead had to be pinned from the individual Azure service page itself. For example, the cost panel had to be pinned from the Cost Analysis page after filtering to the correct resource group. This made the process more difficult than expected, especially because a lot of Microsoft’s online documentation and older guides no longer matched the current Azure portal layout. Since the Azure interface has changed significantly over time, finding the correct buttons and workflows for pinning certain tiles took a fair amount of trial and error.
Dashboard name: crux-ops-dashboard
The screenshot below shows the Azure Dashboard that was shared with the instructor for review.
Instructor Access Information
Public application URL: https://crux.silkspace.net/index.php/Main_Page
Public IP: 20.3.236.215

This project was submitted using a live CMS-hosted documentation model. The MediaWiki site contains the hosted Ops Brief content, while the Azure Dashboard link provides the operational evidence required by the project rubric.
Conclusion
Project Crux gave me practical experience in building a cloud workload that had to be secured, monitored, documented, and justified instead of just spun up and left running. The final environment brought together core Azure infrastructure, Linux administration, web application hosting, secrets management, monitoring, and hybrid identity into one working deployment.
MediaWiki also turned out to be a good fit because it let the documentation live inside the application itself, which matched the project’s focus on operational understanding and knowledge sharing. Overall, this project reinforced that successful cloud work is not just about provisioning resources, but about being able to show that the environment is secure, observable, explainable, and responsibly managed.
References
[1] ITAS 267, “Project Crux — Azure Application,” course assignment handout, Mar. 1, 2026.
[2] ITAS 267, “Project Crux Rubric: Azure Ascent — Cloud Infrastructure & Identity,” course rubric, Mar. 2026.
[3] ITAS 267, “Project Crux revised documentation and dashboard requirements,” course update handout, Mar. 2026.
[4] Microsoft, “Quickstart: Create a Linux virtual machine in the Azure portal,” Microsoft Learn, Feb. 6, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/quick-create-portal. Accessed: Mar. 29, 2026.

[5] Microsoft, “Create and use an SSH key pair for Linux VMs in Azure,” Microsoft Learn, Oct. 16, 2024. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/mac-create-ssh-keys. Accessed: Mar. 29, 2026.
[6] Microsoft, “Use SSH keys to connect to Linux VMs,” Microsoft Learn, Oct. 16, 2024. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/ssh-from-windows. Accessed: Mar. 29, 2026.
[7] Microsoft, “Azure network security groups overview,” Microsoft Learn, Jul. 15, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-network/network-security-groups-overview. Accessed: Mar. 29, 2026.
[8] Microsoft, “Create, change, or delete Azure network security groups,” Microsoft Learn, Jul. 26, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-network/manage-network-security-group. Accessed: Mar. 29, 2026.
[9] Microsoft, “Change the size of a virtual machine,” Microsoft Learn, Feb. 5, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/resize-vm. Accessed: Mar. 29, 2026.
[10] MediaWiki Contributors, “Manual:Installing MediaWiki,” MediaWiki.org, Mar. 19, 2026. [Online]. Available: https://www.mediawiki.org/wiki/Manual:Installing_MediaWiki. Accessed: Mar. 29, 2026.
[11] MediaWiki Contributors, “Manual:Skin configuration,” MediaWiki.org. [Online]. Available: https://www.mediawiki.org/wiki/Manual:Skin_configuration. Accessed: Mar. 29, 2026.
[12] Microsoft, “Create or edit a metric alert rule,” Microsoft Learn, Nov. 18, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-create-metric-alert-rule. Accessed: Mar. 29, 2026.
[13] Microsoft, “Collect log data from virtual machines with Azure Monitor,” Microsoft Learn. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-monitor/vm/data-collection. Accessed: Mar. 29, 2026.
[14] Microsoft, “Collect Syslog events with Azure Monitor Agent,” Microsoft Learn, Mar. 3, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-monitor/vm/data-collection-syslog. Accessed: Mar. 29, 2026.
[15] Microsoft, “Create and share dashboards that visualize data in Azure Monitor,” Microsoft Learn, Jan. 23, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-monitor/visualize/tutorial-logs-dashboards. Accessed: Mar. 29, 2026.
[16] Microsoft, “Create a dashboard in the Azure portal,” Microsoft Learn, Dec. 16, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-portal/azure-portal-dashboards. Accessed: Mar. 29, 2026.
[17] Microsoft, “Share Azure portal dashboards by using Azure role-based access control,” Microsoft Learn, Jun. 19, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/azure-portal/azure-portal-dashboard-share-access. Accessed: Mar. 29, 2026.
[18] Microsoft, “Quickstart: Run Resource Graph query using Azure portal,” Microsoft Learn, Apr. 23, 2024. [Online]. Available: https://learn.microsoft.com/en-us/azure/governance/resource-graph/first-query-portal. Accessed: Mar. 29, 2026.
[19] Microsoft, “Grant permission to applications to access an Azure key vault using Azure role-based access control,” Microsoft Learn, Mar. 25, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/key-vault/general/rbac-guide. Accessed: Mar. 29, 2026.
[20] Microsoft, “Azure role-based access control (Azure RBAC) vs. access policies (legacy),” Microsoft Learn, Mar. 23, 2026. [Online]. Available: https://learn.microsoft.com/en-us/azure/key-vault/general/rbac-access-policy. Accessed: Mar. 29, 2026.
[21] Microsoft, “Estimate costs with the Azure pricing calculator,” Microsoft Learn, Jul. 21, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/cost-management-billing/costs/pricing-calculator. Accessed: Mar. 29, 2026.
[22] Microsoft, “Quickstart: Start using Cost Analysis,” Microsoft Learn, Jul. 1, 2025. [Online]. Available: https://learn.microsoft.com/en-us/azure/cost-management-billing/costs/quick-acm-cost-analysis. Accessed: Mar. 29, 2026.

[23] Microsoft, “Microsoft Entra Connect: Prerequisites and hardware,” Microsoft Learn, Jan. 16, 2026. [Online]. Available: https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-install-prerequisites. Accessed: Mar. 29, 2026.
[24] Microsoft, “Customize an installation of Microsoft Entra Connect,” Microsoft Learn, Apr. 9, 2025. [Online]. Available: https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-install-custom. Accessed: Mar. 29, 2026.
[25] Microsoft, “Microsoft Entra Connect Sync: Configure filtering,” Microsoft Learn, Apr. 9, 2025. [Online]. Available: https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-sync-configure-filtering. Accessed: Mar. 29, 2026.
[26] Microsoft, “Implement password hash synchronization with Microsoft Entra Connect Sync,” Microsoft Learn, Dec. 3, 2025. [Online]. Available: https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-password-hash-synchronization. Accessed: Mar. 29, 2026.
[27] OpenAI, “ChatGPT,” AI language model, used as an interactive assistant for project troubleshooting, planning, and documentation support, Mar. 2026. [Online]. Available: https://chatgpt.com/. Accessed: Mar. 29, 2026.

[28] Google, “Gemini,” AI language model, used as an interactive assistant for supplementary troubleshooting and documentation support, Mar. 2026. [Online]. Available: https://gemini.google.com/. Accessed: Mar. 29, 2026.