
Load Balancing Operations

Responsibilities and Areas of Ownership of F5 platforms

General escalation

Engineering and operational staff should engage a team, rather than an individual, for operational F5 needs.

Alerts

Expectation: Daily awareness of any operational alerts or notifications from the F5 systems.

  • Respond to SNMP alert emails and to automatically triggered alerts in Slack/Teams/PagerDuty.
    • Including, but not limited to: low disk space on DCDs (expand drives as needed), blade failures, unexpected failovers, and DNSSEC key rollovers.
  • Verify monthly that SNMP traps from F5 DNS are working.
  • Establish which alerts drive the most value, tune out noise, and make sure the alerts that come through are actionable and valuable.
  • Open question: can these alerts be sent to PagerDuty, or to a PagerDuty intermediary system, to notify the team?
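One possible route for the PagerDuty question above is the PagerDuty Events API v2. The following is a minimal sketch, not an agreed design: the routing key, the source name, and the alert text are hypothetical placeholders, and parsing the incoming SNMP alert is not shown.

```python
import json
import urllib.request

# Public PagerDuty Events API v2 endpoint.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_event(routing_key, summary, source, severity="warning"):
    """Build an Events API v2 'trigger' payload for one F5 alert."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    return {
        "routing_key": routing_key,  # integration key of the PagerDuty service (placeholder)
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


def send_event(event):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

For example, `send_event(build_event("<routing-key>", "Low disk space on DCD", "dcd-01", "critical"))` would open an incident on whichever service owns that key. Keeping payload construction separate from sending makes it easy to check alert mapping without touching the network.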

Upgrades/Patches

F5 Software Lifecycle

Expectation:

  • All F5s need to be running the current major release (x.) or one major release previous.
  • All F5s should be running the current LTS (x.x.0) release for the associated major release within 2 months.
    • Currently we wait until the first maintenance release (x.x.1) before updating. Moving forward, we’d like to update even to x.x.0 and outline a rollback procedure if necessary.
  • Pre-prod environments can be updated during business hours with notification to development and operations.
  • Production environments can be updated during business hours in the offline data center, subject to the caveats for each cluster. Reference the procedural documents.
  • Maintain a list of all production and pre-production F5 devices/platforms that includes: current version, currently available LTS, an order of priority for upgrades, an estimate of the release cycle for new LTS releases (e.g. quarterly, monthly on x day), and the date when any device certificates will expire. If a service stakeholder can be identified, list them as well.
  • Knowing the release cycle, we should begin planning and communicating about upgrades even before they are released.
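The device list and the version expectations above could be tracked in a small script. This is a sketch under assumptions: the field names, device names, and version strings are hypothetical, and versions are assumed to be dot-separated integers.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class F5Device:
    name: str
    environment: str              # "production" or "pre-production"
    current_version: str          # e.g. "17.1.0" (illustrative)
    available_lts: str            # currently available LTS for this major release
    upgrade_priority: int         # lower number = upgrade sooner
    cert_expiry: date             # device certificate expiry date
    stakeholder: Optional[str] = None


def parse_version(version):
    """Turn '17.1.0' into (17, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def major_is_supported(device_version, current_major):
    """True if the device runs the current major release or one release previous."""
    return current_major - parse_version(device_version)[0] in (0, 1)


def needs_lts_update(device):
    """True if the device is behind the currently available LTS."""
    return parse_version(device.current_version) < parse_version(device.available_lts)


def upgrade_queue(devices):
    """Devices that need an update, ordered by upgrade priority."""
    pending = [d for d in devices if needs_lts_update(d)]
    return sorted(pending, key=lambda d: d.upgrade_priority)
```

Comparing parsed tuples rather than raw strings avoids ordering "15.1.10" before "15.1.9".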

F5 BIG-IP

Frequency: As needed for bug fixes, security patches or major version at x.1 (LTS) release
Expected time to complete: 3 hours (excluding planning/coordination) per cluster

F5 BIG-IQ

Frequency: As needed for bug fixes, security patches - x.1+ releases
Expected time to complete: 24-48 hours

F5 cert management

  • BIG-IP device certs expire in March (yearly).
  • BIG-IQ device certs expire in February (yearly).
  • Instructions for F5 device certs are in the Lakitu SharePoint space (to do: migrate these to Obsidian).

Expected time to complete: 1-2 days
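Because the renewals take a day or two, it may help to flag certificates ahead of expiry. A minimal sketch, assuming the expiry date is available as an openssl-style `notAfter=` line; the 30-day lead time and the sample date are illustrative assumptions, not values from this document.

```python
from datetime import datetime, timezone


def parse_not_after(line):
    """Parse a line such as 'notAfter=Mar 15 12:00:00 2025 GMT' into a UTC datetime."""
    value = line.strip().split("=", 1)[-1]
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def days_remaining(not_after, now):
    """Whole days between now and the certificate expiry."""
    return (not_after - now).days


def renewal_due(not_after, now, lead_days=30):
    """True once the certificate is within lead_days of expiring."""
    return days_remaining(not_after, now) <= lead_days
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the check easy to test and to run for planning dates in the future.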

Capacity Management

Monitor and evaluate current and future capacity needs for TM, ESI and NT.

  • Submit stats to the soc.jackhenry.com dashboard on the first Monday of the month. Time to complete: 1 hr
  • Attend monthly capacity management meetings for IS.
  • Attend Banno scale meetings as needed, e.g. if you identify an issue and want to raise its impact, or if a team lead knows of a pertinent topic that will be covered.
  • Jason has instructions in the Obsidian vault; the metrics spreadsheet is in the Lakitu SharePoint space. IMS sends NT monthly reports on the 1st day of the month. A TM dashboard has been created, but we don’t report on TM at the moment.
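The "first Monday of the month" submission date can be computed for calendar reminders. A small sketch; the function name is illustrative.

```python
from datetime import date, timedelta


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of days to the next Monday.
    offset = (7 - first.weekday()) % 7
    return first + timedelta(days=offset)
```

For example, `first_monday(2025, 3)` returns `date(2025, 3, 3)`, since 1 March 2025 falls on a Saturday.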

Security

  • Review configuration to ensure we are following current CIS benchmarks or vendor best practices for security.
  • Keep relevant FFIEC documentation about these standards, their review, and validation.

Network

  • Manage network and interface build-out as specified by architectural diagrams and project needs.
  • This includes engaging the Corp Tech Services and ICS teams for their areas of responsibility in the build-out.

DNS

With the exception of hosted DNS that requires GTM for Digital purposes:

  • Transition all F5-based authoritative DNS to UltraDNS or AWS Route 53 systems managed by Launch Control and CMS T2 support.
    • In the long term, we may consider adopting Cloudflare DNS.
  • Identify internal owners and the customer list to coordinate migration, and communicate when we will discontinue DNS services for other teams on the F5.

DNSSEC

Establish a valid login to 101domain.com to update the delegation signer (DS) record at the registrar.

  • Key Signing Key
    • Respond to the SNMP alert email when the KSK rolls over, and open a case (for Ground Control?) to create a new DS record on the registrar portal.

Expected time to complete: 1hr
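When a new DS record is requested, it can be useful to cross-check the key tag and digest before they are entered at the registrar. The sketch below follows the key tag algorithm in RFC 4034 Appendix B and the SHA-256 DS digest from RFC 4509. It assumes the new KSK's DNSKEY fields (flags, protocol, algorithm, base64 public key) have already been exported from the F5; the function names and any sample values are illustrative, and the F5 or the registrar may generate these values for you.

```python
import base64
import hashlib
import struct


def name_to_wire(name):
    """Encode a domain name in canonical (lowercase) DNS wire format."""
    wire = b""
    for label in name.rstrip(".").split("."):
        raw = label.lower().encode("ascii")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags, protocol, algorithm, public_key_b64):
    """Assemble DNSKEY RDATA: flags (2 bytes), protocol, algorithm, public key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + base64.b64decode(public_key_b64)


def key_tag(rdata):
    """Key tag over the DNSKEY RDATA, per RFC 4034 Appendix B."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner, rdata):
    """DS digest type 2: SHA-256 over owner name (wire format) plus DNSKEY RDATA."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()


def format_ds(owner, algorithm, rdata):
    """Render the DS record in zone-file presentation format."""
    fqdn = owner.rstrip(".") + "."
    return f"{fqdn} IN DS {key_tag(rdata)} {algorithm} 2 {ds_digest_sha256(owner, rdata)}"
```

A KSK normally has flags 257 and protocol 3. Comparing the output of `format_ds` with what the registrar portal shows after the update is a quick way to catch a copy-and-paste error.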