Abstract: This paper introduces a unified Bayesian approach to 3–D computer vision using segmented image features. The theoretical part summarizes the basic requirements of statistical object recognition systems. Non–standard types of models are introduced using parametric probability density functions, which allow the implementation of Bayesian classifiers for object recognition purposes. The importance of model densities is demonstrated by concrete examples. Normally distributed features are used for automatic learning, localization, and classification. The contribution concludes with the experimental evaluation of the presented theoretical approach.
1 Introduction
Classification in computer vision is commonly dominated by geometrical, model–based approaches (Faugeras (1993)). Heuristics for many algorithms in image processing restricted to the given problem domain and motivated by associated applications are reported in the literature. Herein, model–based image analysis provides the scientific framework for matching algorithms and for understanding the information process. The comprehensive goal is to describe the intrinsic character of images in a symbolic or parametric manner.
Bayesian methods have provided solutions to various classical problems in pattern recognition. Especially the progress in the field of speech processing is substantially based on the application of statistical methods. The general use of Bayesian classifiers is motivated by several aspects: they show optimality in a decision theoretic sense under a 0–1 cost function (Duda and Hart (1973)). Furthermore, statistical methods can deal with uncertainty in a natural manner, have a well elaborated mathematical theory, and provide a unified framework within which many different tasks can be considered. For that reason, we favor model–based computer vision algorithms which apply statistical discriminants or, at least, close approximations of Bayesian classifiers.
In this paper, we present a probabilistic framework for 3–D vision: statistical methods for object modeling, algorithms for the automatic estimation of model parameters (even in the presence of incomplete and disturbed training data), classification rules, and localization methods for 3–D objects using 2–D views. The introduced model densities show several degrees of freedom, and standard hidden Markov models or mixtures of densities can be derived by specialization. The experiments prove that the classification and pose estimation task for 3–D objects using real image data can be treated statistically
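To make the idea of a Bayesian classifier with normally distributed features concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: a single scalar feature per object, one Gaussian per class, and made-up class labels and training values. The model densities of the paper are considerably more general.

```python
import math
from collections import defaultdict

def fit_gaussian_classes(samples):
    """Estimate (prior, mean, variance) per class from (feature, label) pairs."""
    by_label = defaultdict(list)
    for x, label in samples:
        by_label[label].append(x)
    params = {}
    for label, xs in by_label.items():
        mean = sum(xs) / len(xs)
        # Maximum-likelihood variance; assumes each class has at least two
        # distinct feature values so that the variance is non-zero.
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params[label] = (len(xs) / len(samples), mean, var)
    return params

def log_gaussian(x, mean, var):
    """Log of the normal density N(x; mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def classify(x, params):
    """Bayes decision rule under 0-1 cost: maximize log prior + log likelihood."""
    def log_posterior(label):
        prior, mean, var = params[label]
        return math.log(prior) + log_gaussian(x, mean, var)
    return max(params, key=log_posterior)

# Hypothetical training data: one scalar feature per observed object.
training = [(1.0, "cube"), (1.2, "cube"), (0.8, "cube"),
            (3.0, "cylinder"), (3.3, "cylinder"), (2.7, "cylinder")]
params = fit_gaussian_classes(training)
print(classify(1.1, params), classify(2.9, params))
```

Choosing the class with the largest posterior is what makes the rule optimal under the 0–1 cost function mentioned above; learning reduces to estimating the density parameters from training data.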
Sunday, January 8, 2023
How does image recognition work for humans?
When we see an object or an image, we humans can usually tell immediately and precisely what it is. We sort everything we see into categories based on the attributes we identify in objects. That way, even when we don’t know exactly what an object is, we can usually compare it to categories of objects we have seen before and classify it by its attributes. Take the example of an animal that is unknown to us: even if we cannot tell which animal it is, we can still identify it as an animal.
People rarely think about what they are observing or how they identify objects; it happens subconsciously. We do not attend to everything that surrounds us all the time. Our brain has been trained to identify objects easily, based on previous experience, that is to say, on objects we have encountered in the past. We have an extraordinary power of deduction: when we see something that resembles an object we have seen before, we can deduce that it belongs to a certain category of items. We don’t necessarily need to look at every part of an image to identify the objects in it. As soon as we see a part of an item that we recognize, we know what it is. We usually use colors and contrasts to identify items.
For humans, most image recognition works subconsciously. But it is a lot more complicated when it comes to image recognition with machines.
Deep learning
Deep learning is a subset of machine learning that is, in essence, a neural network with three or more layers. These neural networks attempt to simulate the behavior of the human brain (albeit far from matching its ability), which allows them to “learn” from large amounts of data. While a neural network with a single layer can still make approximate predictions, additional hidden layers can help to optimize and refine for accuracy.
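As a rough illustration of what “layers” means here, the sketch below pushes an input through a tiny network with one hidden layer. The weights are invented for the example and untrained, and the sigmoid activation is an assumption; real deep learning models have many more layers and learn their weights from data.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then a sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical, untrained weights: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = forward([1.0, 2.0], layers)
print(output)
```

Adding a hidden layer is just one more entry in `layers`; training, which is not shown, would adjust the weights so that the output matches known examples.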
Deep learning drives many artificial intelligence (AI) applications and services that improve automation, performing analytical and physical tasks without human intervention. Deep learning technology lies behind everyday products and services (such as digital assistants, voice-enabled TV remotes, and credit card fraud detection) as well as emerging technologies (such as self-driving cars).
Wednesday, January 4, 2023
Mathematical and Statistical Opportunities in Cyber Security
The role of mathematics in a complex system such as the Internet has yet to be deeply explored.
In this paper, we summarize some of the important and pressing problems in cyber security from
the viewpoint of open science environments. We start by posing the question “What fundamental
problems exist within cyber security research that can be helped by advanced mathematics and
statistics?” Our first and most important assumption is that access to real-world data is necessary
to understand large and complex systems like the Internet. Our second assumption is that many
proposed cyber security solutions could critically damage both the openness and the productivity of
scientific research. After examining a range of cyber security problems, we come to the conclusion
that the field of cyber security poses a rich set of new and exciting research opportunities for the
mathematical and statistical sciences.
∗This work was supported by the Director, Office of Science, of the U.S. Department of Energy under Contract
No. DE-AC02-05CH11231.
†High Performance Computing Research, Lawrence Berkeley National Laboratory (JCMeza@lbl.gov).
‡National Energy Research Scientific Computing Center (SCampbell@lbl.gov).
§High Performance Computing Research, Lawrence Berkeley National Laboratory (DHBailey@lbl.gov).
1 Introduction
A cyber security incident of some sort makes the news headlines on an almost daily basis. The
examples are numerous, from individual users’ information loss, to worms and computer viruses, to
large scale criminal behavior precipitated by organized crime and nation states. More recently [22],
the large-scale use of botnets for distributing e-mail spam, distributed denial of service attacks,
and distributing other malware has led to an informal alliance of computer security experts from
across the world. Not surprisingly, the rise in cyber security incidents is due in large part to
the rise in the use of computers and the Internet in all areas of society. In fact, according to [1],
“Incessant scanning of hosts by attackers looking for vulnerable servers has become a fact of Internet
life”. Therefore it comes as no surprise that scientific research has also been affected by cyber
security problems. Indeed, because so much of science has embraced and come to rely so heavily
on computing resources, it is particularly vulnerable to cyber security issues.
While the growth in computational, network, and data resources has completely changed the
way that basic scientific research is conducted, this has in general not been reflected in the way
that computer science has addressed challenges facing cyber security research. A recent report by
the National Research Council [15] states that, “research can produce a better understanding of
why cyberspace is as vulnerable as it is and that such research can lead to new technologies and
policies and their effective implementation, making cyberspace safer and more secure.” However,
the committee was also careful to mention that, “there are no single or even small number of
silver bullets that can solve the cybersecurity problem”. At almost the same time, a grass roots
community effort recently released a report [10] that outlines several areas for a science-based
cyber security research program. The report outlined three focus areas: predictive awareness
for secure systems, self-protective data and software, and trustworthy systems from trustworthy
components. The three focus areas were further subdivided into specific research areas. A similar
report [9] highlighted three major challenge areas: modeling large-scale networks, threat discovery,
and network dynamics.
Interestingly, the role of mathematics in a complex system such as the Internet has yet to be
deeply explored. Willinger and Paxson [36] pointed out as early as 1998 that, “the Internet is
a new world, one where engineering wins out over tradition-conscious mathematics and requires
paradigm shifts that favor a combination of mathematical beauty and high potential for contributing
to pragmatic Internet engineering.” In that same spirit, we would like to ask “What fundamental
problems exist within cyber security research that can be helped by advanced mathematics and
statistics?” In this paper, we summarize some of the more important and pressing problems in
cyber security from the viewpoint of open science environments, and highlight those which we
believe should be of interest to the general mathematical sciences community.
We wish to stress the importance of using a science-based approach. Our first and most important assumption is that, as in other scientific fields, access to real-world data is necessary to
understand large and complex systems. Applying mathematical models and advanced algorithms
to this real-world data, it should be possible to develop validated predictive models that can then
Math Opportunities in Cyber Security LBNL-1667E (2009)
be used to develop more robust applications for workable cyber security. With the rise of cyber
security incidents and more persistent and stronger adversaries, it makes sense to move away from
current ad-hoc, reactive methodologies and limited testing to more rigorous and repeatable approaches. Our second assumption is that many proposed cyber security solutions could critically
damage both the openness and the productivity of scientific research. As such, we want to emphasize that using security models from other areas (e.g. the classified sectors) and relaxing the
conditions usually results in a model that is inherently detrimental to open science. We also note
that many of the challenges in cyber security arise from characteristics of the Internet that make it
difficult to model, such as: 1) the self-similar structure of network traffic, 2) the inherent dynamic
nature of the Internet, and 3) the rapid growth in the Internet, both in terms of the number of
components and size of the traffic.
This paper is divided into three sections: 1) data, 2) modeling, and 3) applications. Section 2
makes the case for good raw data as the foundation for cyber security models and policies.
Section 3 outlines some of the major areas where improved modeling is needed. The final section
describes the use of some models and data to implement applications that have been used to detect
and deter adversaries. In all three areas, we highlight some of the mathematical opportunities that
arise as researchers have studied these areas, including recent advances using statistical techniques.
2 The Importance of Data
Trace and log data are needed for scientific analysis not only to create accurate
models, but to provide repeatability and verification of results. In addition, ensuring that the
data’s integrity is maintained throughout the lifetime of its intended use is imperative in order to
be able to validate the results of any scientific model. As a fundamental building block of repeatable
science, we see the lack of freely available raw data as an issue that is both critical for success and
a problem that can be addressed through better mathematical modeling and techniques. This
section deals with three areas that lay the foundation for almost all of the cyber security defense
mechanisms in place today: 1) the need for real-world data on which to base network models, 2) the
need for developing methods that ensure data integrity, and 3) the need to handle large amounts
of data in real or almost real time.
2.1 Data Anonymization and Cleaning
The lack of public data sets for network modeling has been identified as a key weakness in current
networking research [30]. In addition, most intrusion detection systems are based on anomaly
detection for which one of the key assumptions is the availability of good training data. The
generally accepted assumptions for the training data include: 1) attack-free data is available, 2)
simulated data is representative, and 3) network traffic is static. Gates and Taylor [13], however,
challenge these assumptions and argue that most of these assumptions may not hold in many
situations.
Many sites are reluctant to publicly release network data for a variety of reasons including
confidentiality, privacy, and security issues. Given the need for high-quality data however, some
researchers have studied the question of how best to anonymize or sanitize the data so that it
can be released publicly. In all of these cases, there is an inherent tradeoff between the need to
ensure security and privacy and the need for high-quality data that still represents real-world traffic
data. This is known as the utility versus security trade-off [33]. Several approaches for sanitizing
data have been suggested with various degrees of success [2]. It is also important to consider how
effective a particular anonymization policy is and there have been some efforts to measure this [20].
The diversity of techniques used in tackling the data indicates that a more systematic approach
might be applied to this problem as well as methods for maximizing a particular utility function.
The application of data anonymization methods is not restricted to raw data however. Higher layer
abstractions such as NetFlow records (Cisco IOS NetFlow. http://www.cisco.com/go/netflow) or
Bro connection logs [29] are also extremely powerful for large scale measurement and modeling.
In addition to releasing (possibly anonymized) real data, there are other ways of generating
test data – synthesis and reference data. However, generating synthetic data can be problematic
in terms of research value [23], while reference data (data recorded at locations such as honey pots
which have no privacy constraints) may not fulfill the needs of the researcher due to the specific
nature of the traffic [2].
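The utility versus security trade-off can be illustrated with a very simple sanitization step. The sketch below replaces each IPv4 address with a pseudonym derived from a keyed hash. This is an assumption chosen for illustration, not one of the methods evaluated in the cited work; the function name and the key are hypothetical.

```python
import hashlib
import hmac
import ipaddress

def pseudonymize_ip(address, key):
    """Map an IPv4 address to a stable pseudonymous IPv4 address.

    The same (address, key) pair always yields the same pseudonym, so
    per-host statistics survive. Subnet structure is destroyed, and because
    the digest is truncated to 32 bits, distinct hosts can occasionally
    collide.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))

# Hypothetical log entries and secret key.
key = b"site-local secret"
log = ["192.0.2.10", "192.0.2.11", "192.0.2.10"]
sanitized = [pseudonymize_ip(ip, key) for ip in log]
print(sanitized)
```

Repeated hosts remain recognizable as repeats, which preserves some utility for traffic modeling, while the loss of prefix relationships shows what is given up for privacy.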
2.2 Data Integrity
As the data sets in science grow and as our dependence on them for understanding science increases,
there is a critical need for ensuring that data maintains its integrity over the lifetime of its intended
use. By integrity here, we mean the trustworthiness of either the data or resources, which includes
data integrity (content) and origin integrity (the source of the data). As described by Bishop [3],
integrity mechanisms fall into two main classes: prevention mechanisms and detection mechanisms.
New methods will need to be developed that can ensure that the data being generated by large
experimental facilities such as the Large Hadron Collider, ITER, or any of the large accelerators,
maintains its integrity during the course of the analysis of said data. Likewise, as programs such as
the Department of Energy’s Scientific Discovery through Advanced Computing (SciDAC) expand
towards exascale facilities, the data sets generated by the codes that the SciDAC program supports
can be expected to also grow in size. This simulation output will also require methods for maintaining data integrity. One interesting new approach for securing the provenance of data was suggested
by Braun et al. [4]. They note that provenance can be modeled as a causality graph with annotations, where
the causality graph describes the process that produced the data’s present state. As such, the
graph can be represented as an immutable directed acyclic graph (DAG). Although they present
a security solution, they also note that more research is required to construct a security model for
causal graphs.
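One way to picture an immutable provenance DAG is to give every record an identifier that is a cryptographic hash of its content and of its parents' identifiers. The Python sketch below follows that idea. It is an illustrative assumption in the style of a hash-linked graph, not the security solution proposed in [4], and all record names are hypothetical.

```python
import hashlib
import json

def record_digest(payload, parent_ids):
    """Hash that commits to a record's payload and its parents' ids."""
    body = json.dumps({"payload": payload, "parents": parent_ids}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def make_record(payload, parents):
    """Create a provenance node linked to the records it was derived from."""
    parent_ids = sorted(p["id"] for p in parents)
    return {"id": record_digest(payload, parent_ids),
            "payload": payload, "parents": parent_ids}

def verify(record, store):
    """Check a record's hash and, recursively, the hashes of its ancestors."""
    if record_digest(record["payload"], record["parents"]) != record["id"]:
        return False
    return all(pid in store and verify(store[pid], store)
               for pid in record["parents"])

# Hypothetical analysis chain: two inputs produce one derived data set.
raw = make_record("detector readout", [])
calibration = make_record("calibration table v1", [])
derived = make_record("calibrated events", [raw, calibration])
store = {r["id"]: r for r in (raw, calibration, derived)}
print(verify(derived, store))
```

Because each identifier depends on the parents' identifiers, a cycle cannot be constructed, and changing any ancestor's payload after the fact makes verification of every descendant fail. This detects tampering but does not by itself address confidentiality or who may append records.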
2.3 Real-time data
In order to respond quickly to a cyber security attack, organizations need to analyze high volumes
of traffic data and detect anomalies in real time. Dreger et al. [8] cite some examples from two operational cases that consisted of networks with tens of thousands of hosts, transferring 2-3 terabytes
of data/day and 44,000 packets/sec on average. Some interesting work using statistical techniques
such as sequential hypothesis testing has shown that this is possible [17]. The basic idea is to model
accesses to local IP addresses as a random walk on one of two stochastic processes, corresponding respectively to a benign and a malicious process. The use of sequential hypothesis testing is
intriguing because it can be used to establish mathematical bounds on the expected performance
of the algorithm. While this work is quite promising, the authors point out that their work only
focused on the detection of an attack from a single remote address. New mathematical models
will be needed to determine whether a coordinated attack from a set of remote addresses is taking
place. In addition, as networks increase in bandwidth and the number of hosts increases, it is clear
that the data sets that need to be analyzed will also grow, requiring new mathematical methods
that can scale with the traffic data.
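The sequential hypothesis testing idea can be sketched in a few lines of Python. Each connection attempt by a remote address either succeeds or fails; a benign source is assumed to succeed with probability `theta0` and a scanner with the lower probability `theta1`. The log-likelihood ratio performs a random walk until it crosses one of two thresholds set by the desired detection and false positive rates. The parameter values and the function name below are illustrative assumptions, not a reproduction of the detector in [17].

```python
import math

def sprt_scan_detector(outcomes, theta0=0.8, theta1=0.2,
                       p_false=0.01, p_detect=0.99):
    """Sequential probability ratio test over connection outcomes.

    outcomes: iterable of booleans, True if the connection attempt succeeded.
    theta0 / theta1: assumed success probability for benign / malicious sources.
    Returns (decision, number_of_observations_used).
    """
    upper = math.log(p_detect / p_false)              # accept "malicious"
    lower = math.log((1 - p_detect) / (1 - p_false))  # accept "benign"
    llr = 0.0
    count = 0
    for success in outcomes:
        count += 1
        if success:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "malicious", count
        if llr <= lower:
            return "benign", count
    return "undecided", count

print(sprt_scan_detector([False] * 10))       # a source whose probes all fail
print(sprt_scan_detector([True] * 10))        # a source whose connections all succeed
print(sprt_scan_detector([True, False] * 5))  # mixed evidence
```

With these illustrative parameters, each failure moves the walk up by log 4 and each success moves it down by the same amount, so four consecutive failures are enough to cross the upper threshold log 99. The thresholds are what give the mathematical bounds on expected performance mentioned above; the sketch handles one remote address at a time and does not address coordinated attacks.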
3 Modeling
3.1 Internet Modeling
Modeling the Internet is well-known to be a difficult problem [11, 28]. Some of the difficulties
include the immense heterogeneity of the Internet and the rapid changes over time. Floyd and
Paxson [11] proposed two strategies for developing meaningful simulations in the face of these difficulties: searching for invariants and judiciously exploring the simulation parameter space. In terms
of invariants, Floyd and Paxson suggest among others: diurnal patterns of activity, self-similarity,
heavy-tailed distributions, and log-normal connection sizes. Searching for these invariants can then
be viewed as a problem in estimating the correct set of parameters from the data. The second
strategy, that of judiciously exploring the parameter space, was proposed as a way to
cope with the heterogeneity and change in the Internet architecture. Exploring the simulation
parameter space can also be framed as a mathematical problem. For example, one way is to pose
this as a problem in the design and analysis of computer experiments, for which there is already a
great deal of literature. As the number of parameters increases, though, this approach can be quite
limiting. Therefore, techniques for determining which parameters are the most important for a
particular model or determining the sensitivity of the simulation to certain parameters will need
to be developed. Finally, one can also view this as an optimization problem in which one seeks to
minimize certain behavior as defined by an appropriate cost function.
3.2 Statistical Models
Network traffic is often modeled as a Poisson process because it has good theoretical properties.
Internet traffic, however, has been shown to have some very complex statistical properties [6, 31].
In fact, many studies have shown that simple Poisson models do not hold for real network traffic
including both local area and wide area network traffic. Several studies [21, 37] have shown instead
that local area network traffic can be modeled much better as a self-similar process. An interesting study
on Internet traffic data that describes these phenomena was provided by Cleveland and Sun [6]. In
addition to an excellent description of the problem, Cleveland and Sun suggest several challenges
for handling traffic data including: 1) statistical tools and models for point processes, marked point
processes, and time series that account for nonstationarity, persistence, and distributions with long
upper tails, 2) theoretical and empirical exploitations of the superposition of Internet traffic, and 3)
integration of statistical models with network simulators. An excellent bibliography on self-similar
traffic modeling and analysis can be found in [18].
Paxson and Floyd [27] suggest that the Poisson-based models should also be abandoned for
wide-area traffic. In his studies of data sets from the teletraffic industry, Resnick [32] noted that
traffic data often exhibit many non-standard characteristics such as heavy tails and long range
dependence. He also described several estimation methods for the analysis of heavy tailed time series
including parameter estimation and model identification methods for autoregressions and moving
averages. However, in the discussion that followed Resnick’s paper, Willinger and Paxson [35]
argued persuasively for using structural models that take into account the context in which the
data arose as opposed to the black box modeling approach that is more commonly used. Clearly
there is a need for further research and development of statistical techniques and methods for
effectively handling phenomena such as heavy tails and long-range dependence that arise in cyber
security data.
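As a small worked example of parameter estimation for heavy tails, the sketch below applies the Hill estimator, a standard tail-index estimator, to synthetic Pareto-distributed "flow sizes". The choice of estimator, the synthetic data, and the value of `k` are assumptions made for illustration; they are not taken from the studies cited above, and real traffic data would require care in choosing `k`.

```python
import math
import random

def hill_tail_index(samples, k):
    """Hill estimate of the tail index alpha from the k largest observations.

    Assumes positive samples and 0 < k < len(samples). A smaller alpha
    means a heavier tail.
    """
    ordered = sorted(samples, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < number of samples")
    threshold = ordered[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess

# Synthetic data: Pareto samples with true tail index 1.5 (finite mean,
# infinite variance), generated with a fixed seed for repeatability.
rng = random.Random(0)
flow_sizes = [rng.paretovariate(1.5) for _ in range(20000)]
alpha_hat = hill_tail_index(flow_sizes, k=2000)
print(round(alpha_hat, 2))
```

For data that are exactly Pareto, the estimate should land close to the true index. The estimator treats the data as a black box, which is precisely the kind of modeling that the structural-model argument above cautions against relying on alone.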
4 Network Intrusion and Attack Response
Network intrusion and attack detection typically results in the application of a successful model
built from real-world data. For example, if one has a working model of how an external adversary
might scan one or more hosts on a local network, one can build a simple detector based on this.
The decision to build depends on questions of scale, time and threat model. Scale might be a single
host, a local network event or even an organized entity with hundreds of thousands of systems.
Each notion of scale describes (when taken with a temporal component) a problem space in cyber security. One commonly used strategy for detection is through anomaly detection, with the
explicit assumption that any malicious behavior is anomalous [13]. As a result, many approaches
for anomaly detection have been proposed, including support vector regression [24], k-means [19],
multivariate adaptive regression splines [26], Kalman filters [34], and sequential hypothesis testing [17]. As data set sizes increase, along with the need to detect intrusions more quickly, more robust,
accurate, and efficient algorithms will almost certainly be required.
4.1 Attack Detection
The current state of the art in attack detection offers many areas that could benefit from improvements to model or algorithm design. Network intrusion detection has traditionally been focused
on identifying attackers who are seeking information about internal systems and services, sometimes over large address spaces or large time periods. A fundamental component of this is scan
detection where one or more remote network systems look over address ranges to survey available
services. To do this an adversary needs to scan some range of address space. Single host detection
has proven to be predictably accurate via sequential probability testing by Jung et al. [17], but
there are significant areas for improvement in the detection of distributed scanning, see for example
Gates [14].
Understanding how to represent both attack and defense is essential for developing a workable
strategy. Examples of this that might be extendable via a more complete analysis of the problem
space are [7, 25], which look at modeling an intrusion detection system’s observable attack space
as well as optimal placement strategies. For example, Modelo-Howard [25] proposed a new method
based on a Bayesian network model that can characterize the relationship between attack steps and
detectors. This resulted in an algorithm that could evaluate the effect of detector configuration on
system security. A question that was left for future work was whether the solution was scalable
to larger attack graphs and more detectors. Similarly, Collins [7] argues that attacks should be
viewed as a design specification, where the attacker is an engineer with specified goals. His proposed
solution involves estimating a detection surface via multiple Monte Carlo runs to build up a model
for the probability of detection. This naturally leads to the question of which models work best
for different attack scenarios and how to best estimate them so as to reduce the number of false
positives while still detecting the true attacks.
4.2 Automated Attack Response
Recently, there has been considerable work on developing capabilities to automatically detect attacks based
on both network and system behavior in order to reduce the time between attack detection and
response. Autogeneration of network signatures based on protocol and attack heuristics has been
explored by Yegneswaran et al. [38, 39]. System call deviations based on static and dynamic analysis
of downloaded binaries have also been studied by a number of people including Christodorescu et
al. [5].
4.3 Complex Systems and Other Novel Approaches
Other types of models have recently been proposed. For example, Forrest and Hofmeyr [12, 16] have
described models for network intrusion detection and virus detection based on an immunological
distinction between “self” and “nonself.” Using the analogy with an immune system, they have
studied problems in computer virus detection, host-based intrusion detection, automated response,
and network intrusion detection. For the network intrusion detection study, the system was tested
on two months of network traffic data collected on a subnet of 50 computers at the Computer Science
department at the University of New Mexico. While preliminary, the results seem promising in that
the false positive rate was on the order of two per day and the system was also able to successfully
detect all seven intrusion incidents that were inserted into the system.
Another interesting approach was proposed by Zou et al. [41]. They proposed an adaptive
defense principle based on minimizing a particular cost function. The cost function depended on
the attack severity, attack traffic, and some other metrics that are determined by the types of
attacks. They also presented a system design based on this approach to defend against SYN flood
DDoS attacks and Internet worm infection.
Finally, Zhou et al. [40] have proposed several novel alert correlation algorithms for network
intrusion detection that reduce the number of false alerts. The basic building block of the model
is a logical formula called a capability. They use the notion of capability to abstract consistently
and precisely all levels of accesses obtained by the attacker in each step of a multistage intrusion.
The correlation algorithm is based on a new set searching algorithm that captures the case where
multiple earlier attacks together support a new attack. The experimental results of the correlator
using several intrusion datasets demonstrate that the approach is effective in both alert fusion and
alert correlation and has the ability to correlate alerts of complex multistage intrusions.
5 Conclusions
In this paper, we presented some of what we believe are the most important problems in cyber
security for open science environments and highlighted those areas where mathematics and statistics
could provide new approaches and solutions. The use of mathematics and statistics in this field is
relatively new and much remains to be done. We also believe that the type of mathematics needed
to address problems in cyber security will likely come from the use of non-traditional methods or
techniques. In summary, to paraphrase from Willinger and Paxson [36], we believe that the field
of cyber security poses a rich set of new and exciting research opportunities for the mathematical
and statistical sciences.
References
[1] Mark Allman, Vern Paxson, and Jeff Terrell. A brief history of scanning. In IMC ’07: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, pages 77–82, New
York, NY, USA, 2007. ACM.
[2] M. Bishop, R. Crawford, B. Bhumiratana, L. Clark, and K. Levitt. Some problems in sanitizing
network data. In 15th IEEE International Workshops on Enabling Technologies: Infrastructure
for Collaborative Enterprise, pages 307–312, 2006.
[3] Matt Bishop. Computer Security: Art and Science. Addison Wesley, 2003.
[4] Uri Braun, Avraham Shinnar, and Margo Seltzer. Securing provenance. In HOTSEC’08:
Proceedings of the 3rd conference on Hot topics in security, pages 1–5, Berkeley, CA, USA,
2008. USENIX Association.
[5] M. Christodorescu, S. Seshia, S. Jha, D. Song, and R. E. Bryant. Semantics-aware malware
detection. In IEEE Symposium on Security and Privacy, June 2005.
[6] W.S. Cleveland and Don X. Sun. Internet traffic data. Journal American Statistical Association, 95:979–985, 2000. Reprinted in Statistics in the 21st Century , edited by A. E. Raftery,
M. A. Tanner, and M. T. Wells, Chapman & Hall/CRC, New York, 2002.
[7] M. Collins and M. Reiter. On the limits of payload-oblivious network attack detection. In
Recent Advances in Intrusion Detection, May 2008.
[8] Holger Dreger, Anja Feldmann, Vern Paxson, and Robin Sommer. Operational experiences
with high-volume network intrusion detection. In CCS ’04: Proceedings of the 11th ACM
conference on Computer and communications security, pages 2–11, New York, NY, USA, 2004.
ACM.
[9] Daniel M. Dunlavy, Bruce Hendrickson, and Tamara G. Kolda. Mathematical challenges in
cybersecurity. Technical Report SAND2009-0805, Sandia National Laboratories, February
2009.
[10] Charlie Catlett (Ed.). A scientific research and development approach to cyber security. Report
submitted to the U.S. Department of Energy, 2008.
[11] S. Floyd and V. Paxson. Difficulties in simulating the internet. IEEE/ACM Transactions on
Networking, 9(4):392–403, Aug. 2001.
[12] Stephanie Forrest and Steven A. Hofmeyr. Immunology as information processing. In Design
Principles for the Immune System and Other Distributed Autonomous Systems, edited by L.A.
Segel and I. Cohen. Santa Fe Institute Studies in the Sciences of Complexity, pages 361–387.
Oxford University Press, 2000.
[13] Carrie Gates and Carol Taylor. Challenging the anomaly detection paradigm: a provocative
discussion. In NSPW ’06: Proceedings of the 2006 workshop on new security paradigms, pages
21–29, New York, NY, USA, 2007. ACM.
[14] Carrie Gates. Coordinated scan detection. In 16th Annual Network and Distributed System
Security Symposium, San Diego, CA, February 8–11 2009. Internet Societ
A complete guide to math in cybersecurity
There is a severe shortage of qualified cybersecurity professionals. The demand
for employees at every level is high, and every indication is that this need
will continue to grow. Knowledge of how your skills, interests, experiences, and
aptitudes align with those needed for success in cybersecurity can help you figure
out the best way to get started in the industry. Finding that perfect career is,
at best, tricky. Cybersecurity is a technical field and one that, at its core,
requires strong quantitative skills. This guide is all about how math is used in
cybersecurity and the best way to prepare for a math-driven cybersecurity
career.
Cybersecurity as a science
The nearly ubiquitous global use of computers
in every aspect of life makes understanding the behind-the-screens technology at
once easy to ignore and difficult to understand. In the main, if the desktop,
laptop, tablet, or mobile device does what we expect it to do, we give little
thought to the bits and bytes that scurry behind the screen to make it operate.
On the rare occasion that we stop to contemplate what magic makes these
devices so incredibly powerful, we metaphorically throw our hands
up and exclaim that there is just too much technology crammed into our
electronics for any one person to grasp. If that is how you feel, you are not
alone, and you are not wrong. There is too much technology in our computing and
communication devices for any one person to understand it all. It takes teams of
experts in many fields, working in concert to conceptualize, design,
manufacture, program, configure, protect, and deploy each piece of technology
that we take for granted. The common denominator for these experts is that they
each must be proficient in the core academic disciplines of science, technology,
engineering, and math (STEM). While all STEM disciplines require a good deal of
math, this guide will focus on math as it is needed to be successful in the
general field of computer science and, more specifically, cybersecurity.
Cybersecurity is a sub-discipline of computer science, and many cybersecurity
jobs require less STEM education than does becoming a computer scientist. Often
people paint themselves and others with too broad a brush and declare they are
either creative or logical. Mathematical aptitude is generally attributed to
logical or methodical thinkers. While this is often true, the ability to apply
reason consistently does not preclude the ability to be creative. The creative
mind can express itself using mathematical equations in a most decidedly artful
form. Rather than letting either of these labels deter you from pursuing STEM
fields, consider your relationship with numbers instead. How you feel about
using numbers may be a better barometer of how well you will adapt to STEM
fields. Ask yourself if you enjoy working with numbers and using them to convey
concepts and ideas. If you do, and you can think analytically with a focus on
details, you may have the natural inclination for a career that uses numbers. If
you enjoy numbers, you are likely well suited for fields that require an
understanding of math. If you also enjoy complex puzzles and helping others, you
are probably well suited for work in the field of cybersecurity.
People that enjoy working with numbers
Math plays an essential role in many careers. From
science, to finance, to communications, many knowledge-based professions require
excellence and aptitude in mathematics and quantitative reasoning. These careers
also emphasize logical problem solving, critical thinking, and decision making.
These are skills honed through the study of math. To gain a general
understanding of your relationship with numbers, consider the following traits,
skills, and abilities.
Traits, skills, and abilities of “lovers of numbers” include:
An ability to achieve goals by constructing a path of reason back from the desired result to the current state of an issue, or to reverse engineer a problem to find a solution
An ability to quickly visualize abstract concepts, quantitative relationships, and spatial connections
An ability to understand, communicate, and model using symbols and numbers
An ability to think analytically and offer or receive criticism of ideas and concepts without involving feelings and emotions
An ability to identify and categorize patterns and relationships
An ability to use numbers as justifications to confidently take risks
An ability to track and follow details and work with precision
An ability to display patience as large complex problems are worked out
It is not necessary to be a “lover of numbers” to be successful in cybersecurity, but the
more of the traits, skills, and abilities listed above that you can claim
as yours, the more likely you are to enjoy a numbers-based job.
How math is used in cybersecurity
Cybersecurity is not generally considered to be a
math-intensive profession. That is not to say, however, that familiarity and
comfort with math will not be hugely beneficial for success in cybersecurity. On
the contrary, to advance beyond an entry-level cybersecurity position, a
candidate should be comfortable with high school level math, at least. Whether
expressed as (threat x vulnerability) or (probability x loss) or in some other
more sophisticated fashion, determining risk is a mathematical exercise. At some
level, all security professionals are in the risk calculation business. For many
security workers, this calculation is performed almost subconsciously many times
each day in the execution of their duties. Knowing what is essential and where
to spend time and resources for the most significant result is the essence of
the ability to understand risk. On the front lines of a Security Operations
Center (SOC), a security specialist can be flooded with security alerts. They
must analyze these alerts and make a quick risk assessment to know what they can
handle now and what must be escalated for further investigation. This can be
overwhelming at times and requires an ability to calculate risk very quickly. A
security code auditor will find herself examining code written by other coders.
While many analytical tools are available to assist, she must be able, at a
glance, to recognize weaknesses and vulnerabilities in the code. Writing and
understanding computer software code requires mathematical skills. Binary math
underlies every operation a computer performs. It is used in everything from
establishing IP addresses to network routing. The word binary means composed of,
or involving, two things. A binary number is made up of bits, each having a value
of 0 or 1. A bit (short for binary digit) is the smallest unit of data in a
computer. Computers generally store data and execute instructions in bit
multiples called bytes. In most computer systems, there are eight bits in a
byte. Every number in your computer is an electrical signal, and when these
machines were initially designed, electrical signals were difficult to precisely
measure and control. It made more sense to distinguish only between an “on”
state and an “off” state, typically represented by two distinct voltage levels.
Thus today, binary math is at the heart of all computer machine
language and software. Another math-based concept used in cybersecurity is
hexadecimal math. Rather than having only two options per digit, as in binary
math, hexadecimal math gives each digit 16 possible values, counted from 0 to
15. Since single decimal digits only range from 0 to 9 (10 takes up two
digits), the values 10 through 15 are written using the letters A through F.
Entry-level cybersecurity jobs will
require at least some understanding of computer coding or programming. Computer
code is written with math as its foundation. Coders need to understand
programming concepts like constraints, variables, and programming logic. For
example, you would be required to understand basic computer code like this
elementary if-else statement:
    var x = 1;
    if (x === 1) {
      window.alert("The expression is true!");
    } else {
      window.alert("The expression is false!");
    }
The above is a simple example of computer code. Still, from this, you can see that
you’ll need to have an understanding of mathematical logic and how a computer
will interpret information. Boolean algebra has been fundamental in the
development of digital electronics. First introduced by George Boole in
his 1847 book The Mathematical Analysis of Logic, Boolean algebra is still applied
in modern programming languages. Whereas expressions in elementary algebra
mainly denote numbers, expressions in Boolean algebra denote the truth values
false and true. It deals with operations on logical values and uses binary
variables that take the value 0 or 1. Cryptography is the science of codes and encryption and is
based on mathematical theory. Cryptographic techniques are at the very heart of
information security and data confidentiality. The math used in cryptography can
range from the very basic to highly advanced. Cryptographic algorithms are
built around computational hardness assumptions. A computational hardness
assumption is a hypothesis that a particular problem cannot be solved
efficiently, which makes such algorithms hard for any adversary to break in
practice. Cryptographic techniques are also used by cyber-adversaries and are
integral to ransomware. Cryptovirology is the field that studies how
cryptography can be used to design robust malicious software. In mathematics
and computer science, an algorithm is a finite sequence of clear,
computer-implementable instructions used to solve a problem or to complete a
computation. Algorithms are crucial to computer science and cybersecurity. They
serve as blueprints for executing calculations, data processing, automated
reasoning, and other tasks.
Math requirements for education in cybersecurity
Probably the most effective way to
compare your math aptitude against the requirements for a career in
cybersecurity is to examine the math requirements for various degree and
certification programs in the field. If you have taken and passed these courses,
or if you feel confident that you could complete them successfully, it would be
an excellent indication that your interests and skills are a good match for a
career in cybersecurity. Cybersecurity work does not demand so much math that
a degree in math would be the best preparation for any but the most
technical cybersecurity research positions. These plum jobs exist, but a degree
or certificate in a security-related field will be, in most cases, preferable to
a degree in math. As you scan the cybersecurity-related certification and
degree program course descriptions below, look for the topics that indicate
where math skills may be required. It is not practicable to
list all the math requirements for all the prerequisite courses, but these
samples will provide a reasonable understanding of what is generally needed.
Whether or not you decide to pursue a formal security-related degree program, a
professional cybersecurity certification will go a long way toward advancing
your career. While there are many applicable certifications to choose from,
people that have taken the CompTIA exams report that:
The Security+ exam requires only arithmetic and calculating the risk formula
The Security+ exam requires math for IP/MAC addressing
The Network+ exam requires math for figuring out subnet information
The A+ 220-801 exam requires you to remember and use the equation for calculating the transfer rate of different memory types
Many
cybersecurity associate degree programs do not list any math-related classes in
the list of required courses. Presumably, then, high school graduation would be
the only prerequisite needed to qualify for entry-level security positions
requiring an associate degree. Gaining expertise and preparing for cybersecurity
industry certifications are precisely the two areas where cybersecurity
associate degree programs shine. Whether stand-alone programs designed to
quickly train students for the digital workforce or as a step toward more
cybersecurity education, such as a bachelor’s degree in cybersecurity or a
cybersecurity master’s or Ph.D., associate degree programs play an essential
role in cybersecurity education. As an example of the math required for a
bachelor of engineering degree, consider a BSE degree from Arizona State
University. They list as prerequisites for their junior year concentration in
computer system security the following courses:
Computer Sci BS or Computer Systems Engr BSE major
CSE 310 – Data Structures and Algorithms. Advanced data structures and algorithms, including stacks, queues, trees (B, B+, AVL), and graphs. Searching for graphs, hashing, external sorting.
CSE 365 – Information Assurance. Concepts of information assurance (IA); basic IA techniques, policies, risk management, administration, legal, and ethics issues.
SER 222 – Design and Analysis of Data Structures and Algorithms. Data structures and related algorithms for their specification, complexity analysis, implementation, and application. Sorting and searching, as well as professional responsibilities that are part of program development, documentation, and testing.
The level of
math required for success in these courses is consistent with other engineering
degrees. A student should be confident to enter a BSE program with a good
understanding of high school level algebra, geometry, and calculus. As you would
expect, the math requirements for a master of science degree are more stringent
and demanding. To meet what they see as a burgeoning demand, Boston University
offers its MS students the opportunity to specialize in cybersecurity. This
specialization encompasses courses that focus on technical issues related to
safe software, languages, and architectures, as well as broader societal issues
of privacy and legal ramifications. An eight-course program trains students in a
range of topics, including:
Cryptographic methods
Data and information security
Fault-tolerant computing
Network security
Privacy and anonymity
Software safety
System security
Cryptographic techniques are math-intensive, but students having
completed a BSE degree should have confidence they can be successful in this
course of study. The Ph.D. is the highest degree awarded by universities in the
United States and represents the pinnacle of academic achievement. The
University of Colorado, Colorado Springs (UCCS) offers a security specialty in
their Ph.D. Security degree program. This new multidisciplinary specialty offers
the security curriculum for students to study and conduct multidisciplinary
research in areas of cybersecurity, physical security, and homeland security,
which have become critical and increasingly urgent in today’s personal,
business, and government operations. Validated by the NSA’s Information
Assurance Courseware Evaluation (IACE) Program, UCCS’s Ph.D. program includes:
CS3910 – System Administration and Security. Covers the installation and configuration of mainstream operating systems, important network services, disaster recovery procedures, and techniques for ensuring the security of the system.
CS4200-5200 – Computer Architecture. Computer architecture is the science and art of selecting and interconnecting hardware components to create a computer that meets functional, performance, and cost goals. In this course, you learn how to completely design a correct single processor computer, including processor datapath, processor control, pipelining optimization, instruction-level parallelism and multi-core, memory/cache systems, and I/O. You will see that no magic is required to design a computer. You will learn how to quantitatively measure and evaluate the performance of designs.
CS5220 – Computer Communications. The subject of transmitting information between processors is described in detail. The student is expected to have maturity with hardware and/or real-time concepts. Communication systems, from simple asynchronous point-to-point links to those based on complex network architectures, will be studied. The material will be oriented toward the computer scientist as a user, designer, and evaluator of such systems. Terminology and concepts will be emphasized rather than detailed electronic or physical theory.
CS5920 – Applied Cryptography. Basic security issues in computer communication, classical cryptographic algorithms, symmetric-key cryptography, public-key cryptography, authentication, and digital signatures.
CS6910 – Advanced System Security Design. Advanced topics in network and system security, including firewall design, network intrusion detection, tracking and prevention, virus detection, programming language and OS support for security, and wireless network security.
Without a fondness for numbers, you are likely to find a Ph.D. program in cybersecurity difficult; however, there are many high-level, even C-Suite, jobs in cybersecurity that do not require a Ph.D.
Conclusion
Technology advances at breakneck speed. Year after year,
computer-based technological advances have shaped and revolutionized how we
interact with the world, a world that was inconceivable a few short decades ago.
For many people, trying to find where they fit into this high-tech world can be
a challenge. Attempting to match their interests and aptitudes to a future
career can be confusing. Many careers in technical fields require the use of
math. The quickly growing field of cybersecurity is no exception. Entry-level
careers require at least high-school level math and algebra, and highly
technical security jobs require even more advanced math. There are, however, few
security-centric positions that require math beyond the level expected
of a student earning a Master of Science degree. There are also
many career branches in cybersecurity that are not technical. Like any business,
cybersecurity companies and departments need all types of staff. From
administrative to supervisory, non-technical people make up a large portion of
any organization. Don’t let the labels of “creative person” or “analytical
person” close doors unnecessarily. A love for drawing and art can be indicative
of an ability to conceptualize complex ideas — a handy skill in computer
science. Many successful people have learned to express their creativity in
science. While math is vital for some cybersecurity careers, there are other
more essential skills and characteristics, such as:
A value system that holds helping and protecting others in high esteem
An ability to work in a high-stress environment
A willingness to work as part of a team
An ability to grasp new and complex ideas quickly
If you can write and understand computer code, you likely
already possess the math skills needed for all but the most technical
cybersecurity roles. If you are a candidate for these highly specialized roles,
you undoubtedly have already tested your aptitude and talent for math in
real-world experiences. The best measure of how your math skills and aptitude
align with technical security jobs is to look at the professional certifications
and degrees that cater to the security industry. This guide has presented some
examples of each. Review these examples and ask yourself if there is anything in
your education, work history, or general interests that would qualify or exclude
you from these programs. Truth be told, the security industry needs you and
will, in all likelihood, be happy to find a place for you.
How is Math used in Cyber Security?
We get so caught up in our media streaming, online shopping, and social networking that we forget that nothing happens on a computer without numbers. Every time we post a kitten video, tweet our political views, and tell the world what we had for breakfast, it all boils down to binary code - the numbers '0' and '1'. Maybe one day they'll figure out how to encrypt email using icons and emojis, but for now, we have to surrender to mathematics.
Cybersecurity is no different. When they say the geeks will inherit the earth, does that mean you need a PhD in advanced calculus to save the world from a global financial catastrophe? Let's explore.
How Math is Used in Cybersecurity
Boolean Values: Digital computers rely on a branch of mathematics known as Boolean algebra, even though there weren't any computers around in the day of George Boole, its inventor. Several programming languages, including Python, use it to make decisions and choose responses. Python is a favorite language among the hacking and cybersecurity communities.
Complex Numbers: The branch of algebra dealing with complex numbers, which combine a real part with an imaginary part, is actually a lot of fun. Here, you get to use the letter 'i', which stands for the square root of -1. It's worth delving deep enough into algebra to get to this topic and have the privilege of using this powerful little tool. Many an algebra or calculus student will tell you how many times 'i' got them out of jams on an exam.
Cryptography: This probably accounts for the most massive use of mathematics in cybersecurity. You know when your bank or email program gives you an option to have something encrypted? That. At its simplest, cryptography is no more difficult than those word puzzles where you are given a sentence that is written in numbers instead of words. Each number stands for a particular letter of the alphabet.
By ferreting out all the uses of 'and', 'the', '-ing', and so on, you can eventually decipher the entire sentence. Hackers and information systems analysts use sophisticated equations and mathematical structures to encrypt information.
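The number-for-letter puzzle described above can be sketched in a few lines of Python. This is only an illustration, assuming the simplest possible scheme (A=1 through Z=26); the function names and the sample message are made up for this example, and real encryption is far stronger than this.

```python
from collections import Counter


def encode(text: str) -> list[int]:
    """Replace each letter with its position in the alphabet (A=1 ... Z=26)."""
    return [ord(ch) - ord("A") + 1 for ch in text.upper() if "A" <= ch <= "Z"]


def decode(numbers: list[int]) -> str:
    """Turn alphabet positions back into letters (spaces are not preserved)."""
    return "".join(chr(n + ord("A") - 1) for n in numbers)


message = "THE CAT AND THE HAT"  # made-up sample message
cipher = encode(message)

# Counting how often each number appears is the first step of the
# "ferreting out" described above: frequent numbers hint at frequent letters.
frequencies = Counter(cipher)

print(cipher)
print(frequencies.most_common(3))
print(decode(cipher))
```

Spotting the numbers that repeat most often is the same trick as hunting for 'the' and 'and' by eye, which is exactly why such a simple substitution offers no real protection.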
Scared Yet?
You needn't be. On the popular television program, NCIS, cybersecurity expert Tim McGee has degrees from MIT and Johns Hopkins. Never do you hear him and Abby speak mathematics. This is a very strong sign that access to this profession is not restricted to supercharged math academics, although they certainly contribute to the field.
Another healthy sign for the mathematically challenged (or at least disinterested) is that the exam for CompTIA Cybersecurity CSA+ certification has no mathematical questions. No equations, no calculations, nothing. There is one question that refers to regression analysis, so it is, apparently, at least marginally important to know what that is. Take a moment to look it up. That's one-quarter of a question under your belt, should you decide to pursue that route to the information security field.
The actual test involves a maximum of 85 questions answered within a 165-minute time period. Questions are a mixture of multiple choice and performance. Here, you may need to demonstrate some mathematical savvy.
Preparing Yourself for a Career in Cybersecurity
Perhaps the strongest signal that the cybersecurity field is open to non-mathematicians is the fact that there is no high-level mathematics on the curriculum of most cyber security degree programs. These courses generally teach you how to:
Protect data and teach your colleagues how to protect your company's information systems
Perform vulnerability analysis and penetration testing
Monitor and defend computer networks
Create basic security policies and procedures
The bottom line is that cybersecurity is clearly a field with mathematics at its root, so the more you know, the better. For the hard stuff, though, the academics do most of the heavy lifting. If you are seriously interested in joining the ranks of the cyber warriors, that path is open to you.
What Kind of Math is Used in Cybersecurity?
Most entry-level and mid-level cybersecurity positions, like cybersecurity analyst, aren’t math intensive. There are a lot of graphs and plenty of data analysis, but the required math isn’t particularly advanced. If you can handle basic programming and problem solving, you can thrive. Here’s what you need to get there.
INTERESTED IN BECOMING A CYBERSECURITY ANALYST?
A cybersecurity analyst scours a company’s programs, applications, security systems, networks, and more to identify any defects or flaws that could leave the company’s information vulnerable. Learn more about why this role is an excellent launchpad for loftier career goals with job titles like cybersecurity architect, solutions implementation engineer, and cybersecurity engineer.
Binary Number Theory
Binary math powers everything a computer does, from assigning IP addresses and routing traffic to running a security client’s operating system. It’s a mathematical language that uses only the values “0” and “1” in combination.
Computer networks “speak” in binary, so cybersecurity professionals need to understand how it works. Fortunately, most computer science courses introduce students to binary as part of the curriculum.
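To make the binary idea concrete, here is a small Python sketch that rewrites an IPv4 address in binary and in hexadecimal. The address is a made-up example and the helper name is hypothetical; only the standard-library ipaddress module is used.

```python
import ipaddress


def to_binary_octets(address: str) -> str:
    """Render a dotted IPv4 address as four 8-bit binary groups."""
    packed = ipaddress.IPv4Address(address).packed  # the address as 4 raw bytes
    return ".".join(f"{byte:08b}" for byte in packed)


example = "192.168.1.10"  # hypothetical private-network address

print(to_binary_octets(example))                 # 11000000.10101000.00000001.00001010
print(hex(int(ipaddress.IPv4Address(example))))  # 0xc0a8010a
```

Each group of eight bits is one byte, and each byte is written as exactly two hexadecimal digits, which is why hex shows up so often as a compact shorthand for binary.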
Boolean Algebra
Boolean algebra is used extensively in computer programming. It’s a kind of algebra that describes logical operations using two values, “true” (represented by the digit 1) and “false” (represented by the digit 0). Boolean algebra manipulates those values using logical functions such as AND, OR, and NOT.
Unlike other forms of algebra, Boolean algebra doesn’t involve numerical calculations. The answer is always either “yes” or “no,” which is why it’s been so useful in computer coding.
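Here is a minimal Python sketch of Boolean logic at work. The access rule and its parameter names are hypothetical, chosen only to show how AND, OR, and NOT combine yes-or-no answers into a single decision.

```python
from itertools import product


def allow_login(password_ok: bool, token_ok: bool, account_locked: bool) -> bool:
    """Hypothetical rule: both checks must pass AND the account must NOT be locked."""
    return password_ok and token_ok and not account_locked


# Truth table for the two basic two-input operations.
print("a      b      a AND b  a OR b")
for a, b in product([False, True], repeat=2):
    print(f"{a!s:6} {b!s:6} {(a and b)!s:8} {(a or b)!s}")

print(allow_login(True, True, False))  # both checks pass, account not locked
print(allow_login(True, True, True))   # account locked, so access is denied
```

Every branch a program takes, from a login check to a firewall rule, comes down to an expression like this one evaluating to true or false.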
Most cybersecurity training programs, including edX’s Cybersecurity Fundamentals MicroBachelor’s program from NYUx, require you to have some knowledge of programming languages like Python or Java.
Computer/Electrical Engineering
Qualifying degrees: a four-year undergraduate degree or equivalent in any of the disciplines listed below:
Computer/Electrical Engineering: Computer Engineering, Control and Automation Engineering, Control Science and Engineering, Control Theory and Control Engineering, Electrical and Computer Engineering, Electrical and Electronics Engineering, Electrical Engineering and Automation, Electrical Engineering Power and Automation, Electrical Engineering, Electrical Power and Machines Engineering, Electronic Engineering and Applied Electronics, Electronic Engineering Optoelectronic Technology, Electronic Engineering, Electronic Science and Technology, Electronics and Communication Engineering, Electronics and Information Engineering, Electronics and Instrumentation Engineering, Electronics and Telecommunication, Engineering Electronics and Telematics, Information and Computer Engineering, IoT Engineering and Information Technology, IoT Engineering, Mathematics and Information Engineering, Network Engineering, Network Security, Semi Conductor Systems Engineering and Computing, Telecommunication Engineering.
Computer Science: Circuits and Systems, Computer Applications, Computer Science and Artificial Intelligence, Computer Science and Engineering, Computer Systems and Engineering, Computer Technology, Cyber Security, Information Security.
Computer Systems Engineering: Computer and Network Technology, Computer Science and Systems Engineering, Computer Systems Architecture, IoT Engineering and Information Technology, IoT Engineering, Network Engineering, Network Security, Systems Design Engineering, Systems Engineering and Computing Systems Engineering.
Engineering Physics: Aerospace Engineering, Applied Physics, Biophysics, Computer Engineering, Electrical Engineering, Materials Science, Mechanical Engineering, Nanotechnology.
Information Technology: Artificial Intelligence, Computer Science and Information Technology, Computer and Network Technology, Fundamental Informatics and Information Technologies, Electronics and Telecommunications, Electronics and Telematics, Information Security, Information Systems, Management Information Systems.
Math Sciences: Applied Math and Computer Science, Applied Math, Information and Computational Science, Information and Computing Sciences, Information Engineering, Information Management and Information Systems, Mathematics and Computing, Operational Research
Why Pursue an MS in Mathematics Information & Security?
Information Science involves the processing, storage, and communication of information. Information theory answers two fundamental questions in the communication and storage of information: (1) what is the limit on how far information can be compressed (the entropy) and (2) what is the maximum rate at which information can be transmitted reliably (the channel capacity). The theory, initially developed to answer fundamental questions in communication theory, has now expanded to be an integral component of many other subjects, including computational science, signal processing, and information security.
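As a rough illustration of the entropy idea, the Python sketch below estimates the Shannon entropy of a byte string using the standard definition H = -sum(p * log2(p)), where p is the observed frequency of each byte value. The function name and sample inputs are assumptions made for this example, not part of any program's curriculum.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Estimate entropy in bits per byte from observed byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # p * log2(1/p) is the same as -p * log2(p), written to avoid a negative zero.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy(b"aaaaaaaa"))        # one repeated symbol: 0.0
print(shannon_entropy(b"abababab"))        # two equally likely symbols: 1.0
print(shannon_entropy(bytes(range(256))))  # every byte value once: 8.0
```

Low entropy means the data is highly predictable and compresses well; values close to 8 bits per byte are typical of data that is already compressed or encrypted.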
The field is at a crossroads. Although mathematical theory has been instrumental in catalyzing the advances in collecting and analyzing the quickly-growing, vast amounts of complex data, the practitioners need to be taught both how to directly manipulate data themselves and the underlying theory that is its genesis. Moreover, the data needs to be analyzed rigorously and securely.
Most data science and information technology programs tend to provide only a weak theoretical grounding before presenting the tools of the day. Students in those programs are essentially tracked into a trade, with no path for technical growth beyond what they’ve been taught. Our program remedies this issue: students learn solid theory, cutting-edge research, and how to apply these to actual data problems so they can continue to grow in their career. Graduates of the program are uniquely qualified for positions including information research scientist, data scientist, data analyst, computer systems analyst, information security analyst, operations research analyst, and more.