The requirements engineering process has been criticised for its immaturity. Firstly, in the
context of safety-critical systems, missing, misunderstood, and erroneous requirements have
been identified as the cause of many safety-system faults; and secondly, in the context of
project success factors, many IT projects have identified requirement defects as a primary
cause of being over-time or over-budget. Ambiguity is a requirement defect that is
commonly associated with challenged IT projects; however, there are few empirical
studies on how ambiguity can be reduced or eliminated from requirement specifications.
Eliminating the ambiguity inherent within a requirement specification is the seemingly
unattainable ambition of the systems engineering zealot. This is because ambiguity is
considered an unavoidable side-effect of using natural language, and most requirement
specifications are written in natural language. One proposed solution to the ambiguity
problem is to express requirements in Controlled Natural Language (CNL). CNLs enforce
grammatical and/or lexical constraints to reduce the inherent ambiguity of natural language
without sacrificing correctness, readability, or expressiveness. There is, however, a view in
the literature that CNLs are overly restrictive and unnatural to read and write. Furthermore,
the design and development of CNLs is both labour-intensive and time-intensive.
This thesis describes how a requirements specification can be automatically re-expressed
in a way that significantly reduces its lexical ambiguity, without significantly reducing its
correctness or conventionality. The thesis specifically focuses on lexical ambiguity, since
this is the form of ambiguity most attributable to the lexicon used to express the
specification. The term re-expression is used to distinguish this approach from that of CNLs,
since the lexicon is not static, but is optimally selected on a word-by-word basis such that
lexical ambiguity is minimised, whilst correctness and conventionality are maximised.
Fundamental to the optimal word selection is a new concept: replaceability(W1, W2), which
is the degree to which word W1 can replace word W2. The replaceability equation developed
within this thesis is a function of semantic similarity, polysemy, frequency, and lexical
We implement a software prototype and execute it on an existing industry specification.
A controlled experiment is used to measure the effects of the re-expression in terms of
correctness, conventionality, and lexical ambiguity. Data are collected from project
stakeholders using a questionnaire-style approach, and hypothesis testing is used to decide
whether or not the optimal re-expression has significantly reduced lexical ambiguity without
significantly reducing correctness or conventionality.