
Detecting Reconnaissance and Deceptive Pretexting Before Attacks Begin

Pre-Breach Indicators
Humanix

What is feature engineering

In practice, feature engineering is both science and a bit of witchcraft. It often involves both iteration and experimentation to uncover hidden patterns and relationships within the data. For instance, a data scientist might transform raw sales data into features such as average purchase value, purchase frequency, or customer lifetime value, which can significantly boost the performance of a churn prediction model. By thoughtfully engineering features, practitioners can provide machine learning models with the most informative inputs, ultimately leading to better accuracy and more robust predictions.
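As a minimal sketch of the sales example above, the following turns raw purchase rows into per-customer features. The row layout, the sample values, and the names `sales` and `build_features` are assumptions made for illustration; they do not come from any particular dataset or library.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw sales rows: (customer_id, purchase_date, amount).
sales = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 2, 5), 60.0),
    ("c2", date(2024, 1, 20), 15.0),
]


def build_features(rows):
    """Aggregate raw purchases into simple per-customer features."""
    amounts_by_customer = defaultdict(list)
    for customer_id, _purchased_on, amount in rows:
        amounts_by_customer[customer_id].append(amount)

    features = {}
    for customer_id, amounts in amounts_by_customer.items():
        features[customer_id] = {
            "avg_purchase_value": sum(amounts) / len(amounts),
            "purchase_count": len(amounts),
            "total_spend": sum(amounts),
        }
    return features


features = build_features(sales)
```

Each resulting dictionary could then serve as one input row for a churn model; which features actually help is something only experimentation will show.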

What’s more?

  • Incorporate more and more data sources
  • Feature engineering platform

What is data engineering

As we mentioned above, feature engineering is certainly a subset of data engineering. It involves the ingestion of data from a source, applying a series of transformations, and making the final result available to be queried by a model for training purposes. You can construct feature engineering pipelines to resemble data engineering pipelines, having schedules, specific source and sink destinations, and availability for querying. However, this configuration would only really apply once you have surpassed the experimentation stage and determined a need for a consistent flow of new feature data.


1. Functions

Functionally, nothing differentiates data from features: both are data points. Where feature engineering and data engineering really differ is in the objectives and motivations for constructing the pipelines. In general, data engineering serves a broader, more unified purpose than feature engineering. Data engineering platforms are constructed to be flexible and universal, ingesting various types and sources of data into a unified storage location where any number of transformations and use cases can be applied. The intent of a well-constructed fact table or gold layer in a data lake is to provide a single source of truth that answers many different questions, produces many reports, and can be consumed by many downstream customers.

2. Practice

And in practice, an organization’s data engineering team will be responsible for the curation and maintenance of all data pipelines, not just those that relate to machine learning. These pipelines may power BI dashboards used by C-Suite, auditing reports that feed payroll, or event logs that show a user’s history of actions within the application.

Feature engineering, on the other hand, serves a specific purpose: finding the tailored inputs and columns that will generate the best predictive results for a machine learning model. Data scientists and machine learning engineers are not tasked with developing a universal data model that will ingest all data points throughout an organization; they just need to select, curate, and clean the data needed to power their models.

3. Machine learning

As machine learning teams grow and incorporate more data sources into their models, their feature engineering platform may start to resemble a larger data engineering platform in the tools and methodologies it employs. But the intent is not to establish flexible data models that can be used throughout the organization; it is simply to power their machine learning models.

The Invisible Preparation Phase

Every successful social engineering attack begins with reconnaissance. Attackers gather intelligence about your organization, employees, and processes before making contact.

Organizations often fail to realize how much information they post online about their people and processes. Attackers scrape LinkedIn for organizational charts, monitor press releases, and study job postings. They find executive interviews and articles, employees' social media, and other records; each provides a piece of the puzzle. AI can accelerate this process.

The information asymmetry allows attackers to know enough to seem legitimate. But they lack deep insider knowledge. This is their weakness.

Reconnaissance typically occurs outside your perimeter through OSINT, making direct detection impossible. However, the pretexting attempts that follow create observable patterns. If we follow this trail, we can stop social engineering.

Pretexting Patterns That Reveal Reconnaissance

Information asymmetry exposes impersonation. Attackers possess surface details from reconnaissance but lack the contextual knowledge of legitimate employees. The "executive" who knows company initiatives but not their direct reports' names. The "vendor" citing project names but unaware of contract specifics. The "traveling employee" with correct manager information but wrong departmental processes. Legitimate employees have rich mental models about processes and timelines that are not available to others. These gaps reveal reconnaissance-based attacks.

Legitimate users might not be happy, but they rarely resist verification. Attackers have no other choice. They craft elaborate explanations for why standard authentication won't work: lost phones, system issues, and travel complications are common excuses. They've researched enough to know verification will fail, so they prepare justifications to side-step it.

Probing behavior reveals information gathering: questions about password reset procedures disguised as policy clarification, calls to multiple departments that map approval processes, and contacts that test how staff respond. These reconnaissance contacts precede actual attacks by days or weeks, providing early warning if detected, but they are easily missed when each call is handled in isolation.

Temporal patterns expose campaigns: first-time callers requesting immediate access, multiple similar contacts in a short period, and requests during off-hours when verification is difficult. These timing anomalies, especially from unknown contacts, signal potential reconnaissance or pretexting.
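The timing checks described above could be sketched roughly as follows. The business hours, the set of sensitive actions, and the names `Contact` and `timing_flags` are all assumptions chosen for illustration, not part of any specific help desk product.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed business hours (local time); adjust to your organization.
BUSINESS_START_HOUR = 8
BUSINESS_END_HOUR = 18

# Hypothetical set of actions treated as security-sensitive.
SENSITIVE_ACTIONS = {"password_reset", "mfa_reset", "access_grant"}


@dataclass
class Contact:
    caller_id: str
    timestamp: datetime
    requested_action: str


def timing_flags(contact, known_callers):
    """Return timing-related warning flags for one help desk contact."""
    flags = []
    if contact.requested_action not in SENSITIVE_ACTIONS:
        return flags

    if contact.caller_id not in known_callers:
        flags.append("first_time_caller_sensitive_request")

    is_weekend = contact.timestamp.weekday() >= 5  # 5 = Saturday, 6 = Sunday
    in_hours = BUSINESS_START_HOUR <= contact.timestamp.hour < BUSINESS_END_HOUR
    if is_weekend or not in_hours:
        flags.append("off_hours_sensitive_request")
    return flags


late_call = Contact("unknown-555", datetime(2024, 3, 2, 23, 15), "password_reset")
flags = timing_flags(late_call, known_callers={"alice-ext-101"})
```

Flags like these are signals for closer verification, not proof of an attack; a legitimate employee can also call late at night.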

Building Early Detection Capabilities

Help your help desk build resiliency. Make it a part of your organizational culture and adopt technologies that support it.

Document external contacts with security-sensitive functions: caller details, stated purpose, information requested, verification outcomes. This contact history reveals patterns invisible to individual interactions. When multiple employees receive similar probing calls, that’s active reconnaissance.
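One way to picture such a contact log is the sketch below. The record fields mirror the items listed above; the seven-day window, the two-recipient threshold, and the names `ContactRecord` and `find_probing_clusters` are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ContactRecord:
    received_by: str            # employee who took the contact
    caller_details: str
    stated_purpose: str
    information_requested: str  # normalized topic label (assumed to be assigned at logging time)
    verification_passed: bool
    timestamp: datetime


def find_probing_clusters(records, window=timedelta(days=7), min_recipients=2):
    """Return topics that several different employees were asked about within `window`."""
    by_topic = defaultdict(list)
    for record in records:
        by_topic[record.information_requested].append(record)

    clusters = {}
    for topic, items in by_topic.items():
        items.sort(key=lambda r: r.timestamp)
        for i, first in enumerate(items):
            recipients = {
                r.received_by
                for r in items[i:]
                if r.timestamp - first.timestamp <= window
            }
            if len(recipients) >= min_recipients:
                clusters[topic] = sorted(recipients)
                break
    return clusters


log = [
    ContactRecord("alice", "caller A", "policy question", "password_reset_procedure",
                  False, datetime(2024, 3, 4, 10, 0)),
    ContactRecord("bob", "caller B", "audit prep", "password_reset_procedure",
                  False, datetime(2024, 3, 6, 15, 30)),
    ContactRecord("carol", "caller C", "new laptop", "vpn_setup",
                  True, datetime(2024, 3, 5, 9, 0)),
]
clusters = find_probing_clusters(log)
```

Grouping by requested information rather than by caller is a deliberate choice here, since an attacker can change names and numbers between calls more easily than the information they are after.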

Establish knowledge benchmarks by role. Define what information legitimate executives, vendors, and employees should possess. When callers fail these knowledge checks despite having partial information, you've detected reconnaissance-based impersonation.
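A knowledge benchmark can be as simple as a per-role checklist. The sketch below assumes hypothetical roles and knowledge items (`KNOWLEDGE_BENCHMARKS`) and a hypothetical helper `knowledge_check`; real benchmarks would be defined by your own security team.

```python
# Hypothetical benchmarks: what a genuine holder of each role should know.
KNOWLEDGE_BENCHMARKS = {
    "executive": {"direct_report_name", "current_initiative"},
    "vendor": {"contract_number", "project_name"},
    "employee": {"manager_name", "department_process"},
}


def knowledge_check(claimed_role, answered_correctly):
    """Compare what a caller got right against the benchmark for the claimed role."""
    expected = KNOWLEDGE_BENCHMARKS[claimed_role]
    known = expected & answered_correctly
    missing = expected - answered_correctly
    return {
        "passed": not missing,
        "missing": sorted(missing),
        # Partial knowledge (some items right, some missing) is the
        # reconnaissance-based impersonation pattern described above.
        "partial_knowledge": bool(known) and bool(missing),
    }


# A "vendor" who can cite the project name but not the contract number.
result = knowledge_check("vendor", {"project_name"})
```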

Deploy conversation analysis tools that identify pretexting language patterns. Excessive detail in backstories, scripted responses to challenges, and reluctance to provide callback numbers all indicate prepared deception. Real-time analysis can flag these patterns during active calls.
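To make the idea concrete, here is a deliberately simple keyword sketch. It is not how any commercial conversation analysis tool works; the cue names, the regular expressions, and the function `flag_cues` are assumptions for illustration, and real tools would rely on far richer signals than phrase matching.

```python
import re

# Illustrative cue patterns only; a real deployment would need tuning and review.
CUES = {
    "verification_bypass_excuse": re.compile(
        r"\b(lost my phone|phone (was )?stolen|travell?ing|system (is|was) down)\b",
        re.IGNORECASE,
    ),
    "callback_reluctance": re.compile(
        r"\b(can'?t|cannot|don'?t) (take|give|provide)\b.*\b(call ?back|number)\b",
        re.IGNORECASE,
    ),
}


def flag_cues(transcript):
    """Return the names of cue patterns found in a call transcript."""
    return sorted(name for name, pattern in CUES.items() if pattern.search(transcript))


cues = flag_cues(
    "I lost my phone at the airport and I can't give you a callback number right now."
)
```

A match should prompt the agent to hold firm on verification rather than end the call; legitimate callers lose phones too.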

Correlate failed verifications with subsequent account activity. A password reset that follows multiple failed verification attempts on the same account indicates reconnaissance-enabled social engineering.
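This correlation can be sketched as a simple join between two event lists. The 24-hour window, the two-failure threshold, the event layout, and the name `resets_after_failures` are assumptions for illustration; your own logs will dictate the real fields.

```python
from datetime import datetime, timedelta


def resets_after_failures(failures, resets, window=timedelta(hours=24), min_failures=2):
    """Flag password resets preceded by repeated failed verifications.

    `failures` and `resets` are lists of (account, timestamp) tuples.
    Returns (account, reset_time, failure_count) for each suspicious reset.
    """
    flagged = []
    for account, reset_time in resets:
        prior = [
            failed_at
            for failed_account, failed_at in failures
            if failed_account == account
            and timedelta(0) <= reset_time - failed_at <= window
        ]
        if len(prior) >= min_failures:
            flagged.append((account, reset_time, len(prior)))
    return flagged


failures = [
    ("jsmith", datetime(2024, 3, 4, 9, 0)),
    ("jsmith", datetime(2024, 3, 4, 9, 20)),
    ("mlee", datetime(2024, 3, 1, 14, 0)),
]
resets = [
    ("jsmith", datetime(2024, 3, 4, 10, 0)),
    ("mlee", datetime(2024, 3, 4, 11, 0)),
]
suspicious = resets_after_failures(failures, resets)
```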

Recommended Actions

Immediate steps: Log all help desk verification failures and bypass requests. Alert on first-time callers requesting sensitive actions.

Prevention focus: Share detected pretexting attempts organization-wide. When employees know current attack narratives, they become far more resistant to those specific pretexts.

Reconnaissance and pretexting represent the attack phases where detection costs least and prevents most. Organizations that identify these early indicators stop attacks before compromise—the difference between investigation and breach.

Implementation resources:

  • MITRE ATT&CK reconnaissance techniques (T1598, T1593, T1589)
  • OSINT framework for understanding attacker reconnaissance methods
  • Social Engineering Toolkit documentation for pretext examples
  • Human Threat Detection and Response platforms for conversation analysis
  • Industry Information Sharing and Analysis Centers (ISACs) for active campaign intelligence
