By Ulf Schünemann since 2001.

Complexity And Other Software Metrics

  1. Information Classification & Measurement (entropy, transinformation, manifest vs. latent vs. destroyed information, ...)
  2. Taxonomy of Structural Software Metrics (size, length, complexity, cohesion, coupling)
  3. Computational Complexity and Power Classes
[SuB] Norbert Bischof: Struktur und Bedeutung: Eine Einführung in die Systemtheorie; Verlag Hans Huber 1995.
[PBSEM] L C Briand, S Morasca, V R Basili: Property-Based Software Engineering Measurement; IEEE Transactions on Software Engineering; 22(1); Jan. 1996.

CHECK: A Data Model for Object-Oriented Design Metrics (1997)

Johann Eder, Gerti Kappel, Michael Schrefl: Coupling and Cohesion in Object-Oriented Systems (1992)

  1. data coupling
  2. stamp coupling: in a call, more data is passed (e.g. a composite object) than necessary; a part would suffice (see the sketch after this list)
  3. control coupling: the caller determines the callee's logic, or vice versa
  4. external coupling: communication through a structured shared data space (shared objects?)
  5. common coupling: communication through a shared global unstructured data space
  6. content coupling: external access to an object's internal data
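A minimal sketch (the Employee class and both functions are invented for this illustration, not taken from the paper) contrasting stamp coupling with data coupling:

    # Stamp vs. data coupling: the same computation, written twice.
    from dataclasses import dataclass

    @dataclass
    class Employee:
        name: str
        salary: float
        address: str

    def yearly_cost_stamp(employee: Employee) -> float:
        """Stamp coupling: the whole composite object is passed,
        although only the salary field is actually needed."""
        return employee.salary * 12

    def yearly_cost_data(salary: float) -> float:
        """Data coupling: only the elementary data item needed is passed."""
        return salary * 12

    e = Employee("Ada", 5000.0, "1 Analytical Engine Lane")
    assert yearly_cost_stamp(e) == yearly_cost_data(e.salary) == 60000.0

The data-coupled version depends only on one value; the stamp-coupled version is also affected by any change to the Employee structure.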

Information Classification & Measurement

The classification and measurement of information in (proximate) information theory is demonstrated in the following information system context [SuB 72ff]:
    x   +-------+   y
  ----->| y=x+z |------>
        +-------+
            ^
 (A)      z |
Consider two channels (A) and (B), each with a 2-bit information source x (values 0,1,2,3) as input, an endpoint y as output (values 0,1,2,3,4 in (A), values 0,1,2,3 in (B)), and a 1-bit noise signal z (values 0,1).
In (A), where the noise adds to the signal, no information is lost (cf. table below): all of the 2 bits of information (entropy H(x)) of x are still contained in y (which has 2 1/4 bits of entropy H(y)). However, only 1 1/4 bits are still manifest (the transinformation T(x,y)); the remaining 3/4 bit has been turned into latent information (IL(y)) by the noise. This part of the information cannot be attributed to either x or the noise alone. The latent information about x can be recovered if the noise signal is known (see below).
In (B), where the noise is multiplied onto the signal, 1 bit of information is lost irreversibly (cf. table below), since a noise of 0 results in an output y of 0 for any signal x: of the 2 bits of information (H(x)) of x, 4/5 bit is manifest in y (T(x,y)), and 1/5 bit is latent (IL(y)).
    x   +-------+   y
  ----->| y=x*z |------>
        +-------+
            ^
 (B)      z |
in bits (rounded)                                                      (A)    (B)

H(x)    entropy of x = mean information content                         2      2
H(z)    entropy of z                                                    1      1
H(y)    entropy of y                                                   9/4
H(x|y)  signal equivocation = the receiver's uncertainty about what
        the sender actually sent ("disturbed" information)             3/4
H(y|x)  signal dissipation = the sender's uncertainty about what
        the receiver will receive                                        1
T(x,y)  signal transinformation = information of x manifest in y
        ("undisturbed" information) = H(x)-H(x|y) = H(y)-H(y|x)        5/4    4/5
H(z|y)  noise equivocation                                             3/4
H(y|z)  noise dissipation                                                2
T(z,y)  noise transinformation                                         1/4
IL(y)   latent information = information in y overlapping from x and z
        = H(y) - T(x,y) - T(z,y)
        (the manifest information is T(x,y) + T(z,y))                  3/4    1/5
        information lost unrecoverably                                   0      1
   x   +-------+   y   +-------+   u
 ----->| y=x+z |------>| u=y-z |------>
       +-------+       +-------+
           ^               ^
         z |_______________|
The channel extended by an information processor reading the channel's endpoint y and the noise z and ``calculating'' output u. The processor recovers the disturbed information, i.e., converts the information about x which was latent in y back to manifest information in u.
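The quantities in the table can be checked numerically. A minimal sketch (not from [SuB]; function names are ad hoc), assuming x and z are uniformly distributed as in the setup above:

    # Entropies and transinformations for channel (A) y = x+z and (B) y = x*z,
    # with x uniform on {0,1,2,3} and the noise z uniform on {0,1}.
    from collections import Counter
    from itertools import product
    from math import log2

    def entropy(counts):
        """Shannon entropy in bits of a distribution given as value -> count."""
        total = sum(counts.values())
        return -sum(c / total * log2(c / total) for c in counts.values() if c)

    def analyse(channel):
        pairs = list(product(range(4), range(2)))      # uniform joint distribution of (x, z)
        ys = [channel(x, z) for x, z in pairs]
        H_x = entropy(Counter(x for x, _ in pairs))    # H(x) = 2
        H_z = entropy(Counter(z for _, z in pairs))    # H(z) = 1
        H_y = entropy(Counter(ys))
        H_xy = entropy(Counter(zip((x for x, _ in pairs), ys)))   # joint entropy H(x,y)
        H_zy = entropy(Counter(zip((z for _, z in pairs), ys)))   # joint entropy H(z,y)
        T_xy = H_x + H_y - H_xy                        # signal transinformation T(x,y)
        T_zy = H_z + H_y - H_zy                        # noise transinformation T(z,y)
        IL = H_y - T_xy - T_zy                         # latent information IL(y)
        lost = H_x - T_xy - IL                         # information lost unrecoverably
        return {"H(y)": H_y, "T(x,y)": T_xy, "T(z,y)": T_zy, "IL(y)": IL, "lost": lost}

    print("(A)", analyse(lambda x, z: x + z))   # T(x,y)=1.25, IL=0.75, lost=0
    print("(B)", analyse(lambda x, z: x * z))   # T(x,y)~0.80, IL~0.20, lost~1.0

    # Channel (A) extended by the processor u = y - z: all of H(x) is manifest again.
    pairs = list(product(range(4), range(2)))
    us = [(x + z) - z for x, z in pairs]               # u = x, the recovered signal
    T_xu = (entropy(Counter(x for x, _ in pairs)) + entropy(Counter(us))
            - entropy(Counter(zip((x for x, _ in pairs), us))))
    print("(A) after recovery: T(x,u) =", T_xu)        # = H(x) = 2 bits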

Information =/= Organization

[MB4 272] Some claim that information measures organization. The argument runs like this: a deterministic system is predictable, so watching its behavior provides no information; on the other hand a probabilistic system is unpredictable, so whatever it does is surprising or informative. Hence, the more organized a system the more informative the observation of it. This argument contains several important mistakes. Firstly, the information function is not defined on a set of systems (or of states of a system) but on system-message pairs. ... Hence it cannot measure the degree of organization of a system nor any other intrinsic property of it. Secondly, ... [t]he predicate "predictable" is not defined on the set of systems but on the set of pairs. Thirdly, prediction must not be equated with deterministic prediction. If a system is stochastic ... then its behavior may be predicted statistically ... In this case we may be able to predict probabilities (of e.g. transitions) and a number of properties derived from them ... Fourthly, a stochastic system ... is not chaotic: ... an atom is not chaotic like the top of my desk, over which no probability distribution can be defined.
In conclusion, information does not measure organization. ... What we do have are measures of efficiency ... Knowing that a system is highly (or fairly or poorly) efficient we can infer that it is well (or fairly or poorly) organized to attain a certain goal or to help us perform a certain task. But efficiency is an indicator of organization, not a measure of it.

Taxonomy of Structural Software Metrics

From [PBSEM]: A system S = (E,R) consists of elements E and relationships R ⊆ E × E. Assume that S' = (E',R') and S'' = (E'',R'') are two subsystems of S (with E',E'' ⊆ E and R',R'' ⊆ R). A modular system SM is a system S together with a set M of subsystems whose element sets partition E, i.e., ∀e ∈ E: ∃(Em,Rm) ∈ M: e ∈ Em, and for distinct (Em,Rm), (E'm,R'm) ∈ M: Em ∩ E'm = ∅. Then metrics for size, length, complexity, cohesion, and coupling can be classified by the properties they satisfy on such systems.
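A minimal sketch of these definitions (the example system and the intra-/inter-module relationship counts are invented stand-ins for cohesion- and coupling-like measures, not the exact [PBSEM] properties):

    # A system S = (E, R) with R a set of element pairs, and a modular system
    # whose modules partition E.
    from itertools import chain

    E = {"a", "b", "c", "d", "e"}
    R = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "e")}
    M = [{"a", "b", "c"}, {"d", "e"}]          # element sets of the modules

    # Partition property from the text: every element lies in exactly one module.
    assert set(chain.from_iterable(M)) == E
    assert sum(len(m) for m in M) == len(E)

    def module_of(e):
        return next(i for i, m in enumerate(M) if e in m)

    intra = [r for r in R if module_of(r[0]) == module_of(r[1])]   # within a module
    inter = [r for r in R if module_of(r[0]) != module_of(r[1])]   # across modules

    print("cohesion-like (intra-module relationships):", len(intra))   # 3
    print("coupling-like (inter-module relationships):", len(inter))   # 2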

Computational Complexity and Power Classes

Computer science has developed a taxonomy of the complexity (computational difficulty) of problems and of the power of problem-solving systems.
Different ranges of complexity are the main working ground of different fields of computer science (computability, complexity theory, algorithm theory, formal languages), though of course a field is not strictly limited to a fixed range.
The Chomsky hierarchy (recursively enumerable, CSL, CFL, REG) of grammars is based on the grammars' weak generative capacity, i.e., on the terminal strings they can produce. Cf. strong generative capacity, i.e., the producible parse trees.
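A small illustration of the difference (the grammars and the enumerator are invented for this sketch): the two grammars below have the same weak generative capacity, the strings a, aa, aaa, ..., but different strong generative capacity, since the left-recursive one builds left-leaning parse trees and the right-recursive one right-leaning trees.

    # Enumerate the terminal strings of a small grammar (dict: nonterminal -> right-hand sides).
    def derive(grammar, start="S", max_len=4):
        """Terminal strings of length <= max_len derivable from start."""
        results, queue = set(), [start]
        while queue:
            form = queue.pop()
            if form.islower():              # only terminals left: a derived string
                results.add(form)
                continue
            if len(form) > max_len:         # prune over-long forms (no epsilon rules, so they cannot shrink)
                continue
            i = next(i for i, c in enumerate(form) if c.isupper())
            for rhs in grammar[form[i]]:    # expand the leftmost nonterminal
                queue.append(form[:i] + rhs + form[i + 1:])
        return sorted(results)

    G_left  = {"S": ["Sa", "a"]}   # left-recursive: parse trees lean to the left
    G_right = {"S": ["aS", "a"]}   # right-recursive: parse trees lean to the right

    # Same weak generative capacity up to the tested length:
    assert derive(G_left) == derive(G_right) == ["a", "aa", "aaa", "aaaa"]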

The table below lists complexity/power classes, solvers of the corresponding power, and example problems (from [Hopcroft+Ullman] or from elsewhere).
det = deterministic, ndet = nondeterministic, c.f. = context-free, LC = lambda-calculus
Main reference: [Hopcroft, Ullman: Introduction to Automata Theory, Languages and Computation; 1979]
also [F Henglein, H Mairson: The Complexity of Type Inference for Higher-Order Typed Lambda Calculi; POPL'91]
also [N Immerman: Languages That Capture Complexity Classes; SIAM Journal on Computing; 1987]
also [A Arratia-Quesada, S Chauhan, I Stewart: Hierarchies in classes of program schemes; Journal of Logic and Computation 1999]
also [I Stewart: Logical definability versus computational complexity: another equivalence]

Cf. the hierarchy at Wikipedia and the huge list of complexity classes and results at the Complexity Zoo.
RE (recursively enumerable)
coRE, RE
a syntactic class: RE functions can be explicitly enumerated [zoo]

Turing-complete solvers: Turing machines; 2-stack machines; counter machines; RAM; recursive functions; while programs; Chomsky grammars

R (recursive) = RE ∩ coRE; strictly includes PR [cf. Ackermann]
a semantic class: R functions cannot be explicitly enumerated [zoo]

PR (primitive recursive)
a syntactic class: PR functions can be explicitly enumerated

relation to EXPSPACE? (union of DTIME(2^n), DTIME(2^2^n), DTIME(2^2^2^n), etc.)


PSPACE (=NPSPACE; includes NP and coNP)
(solvable in polynomial space)

Buechi automata;
second order existential logic with transitive closure [Imm];
NPSA (NPS + arrays) [mentioned in Ste]

NP (includes RP, =RP?)
(solvable ndet'ly in polynomial time)

ESO (second order existential logic) [theorem by Fagin 74 - found in Imm]

coNP (=NP?, includes coRP, =coRP?)

Problem in NP and coNP: integer factorization [wiki]

NLIN (=NTIME(n)?) (solvable ndet'ly in linear time on a RAM)
NTIME(n) (solvable ndet'ly in linear time on a TM)
CFL (context-free)
(below NTIME(n) and CSL, unknown relation to NL; the diagram in wiki shows CFL below NC)
stack machines ("pushdown automata" PDA), context-free grammars (CFG)
non-regular CFL ...
RP (includes P, =P?, sub of coNP?)
Randomized Polynomial time

defined wrt. probabilistic TMs

eg. is-nonprime

coRP (=RP?, includes P, =P?)
eg. is-prime
P (=NC?[wiki])
(solvable det'ly in polynomial time, "tractable problems")
least fixed point logic (first order logic with least fixed point primitive) [Imm];
path system logic [mentioned in Arr+] [less?] NSPS (NPS with stack) [Arr+]
CSL (context-sensitive) = NSPACE(n), ie. solvable ndet'ly in linear space [Kur64]
primitive recursive functions (equiv. to loop); loop programs; context-sensitive grammars (CSG)
REG (regular)
=DSPACE(O(1)) ie. constant space, =NSPACE(O(1))
finite automata (DFA=NFA),
regular expressions,
monadic ESO[cf]
DSPACE(n) (solvable det'ly in linear space)
L (solvable det'ly in logarithmic space)
NL =coNL
(solvable ndet'ly in logarithmic space)
transitive closure logic (first order logic with special transitive closure primitive)[Imm];
NPS program schemes [Arr+]
NC Nick's Class
(includes NL)
(solvable in (log n)^c time on n^k parallel processors, e.g. a parallel RAM; = decidable by uniform Boolean circuits with polylogarithmic depth and a polynomial number of gates)
NL-complete (hardest in NL): GAP - reachability in directed graph [by Savitch 73 - found in Imm]
(n log n)-space, etc, ...  
P-complete (hardest in P, "probably not parallelizable" / "probably inherently sequential")
type inference for the simply-typed (first-order) LC [complete or maybe simpler?] [by Wand 87 - found in Hen+]
evaluating a propositional formula for given values

emptiness of CFGs' languages
CFG membership [wiki]
CVP - Circuit Value Problem: calculate output of given gate in given circuit for given inputs [wiki]
linear programming [wiki]
game of life [wiki]
LZ78 compression [wiki]
NP-complete (hardest in NP, "probably intractable")
coNP-complete (hardest in coNP)
SAT - satisfiability of an unquantified formula = evaluation of an existentially quantified formula
Vertex Cover; Hamilton Circuit; Integer Linear Programming; Chromatic Number; TSP (traveling salesman); Exact Cover; Knapsack; Partition (of integer lists); Tetris; Minesweeper

is a structural subtyping entailed by a set of subtyping constraints for simple types over lattices of base types? [Hen+]
PSPACE-complete (hardest in PSPACE; known to be disjoint from NC; maybe a subset of P or NP?)
type inference in the presence of subtyping over ground & function types in a simplified form of the ML< type system [A Frey: Satisfying Subtype Inequalities in Polynomial Space; 1997]
QBF - evaluation of quantified Boolean formulas (alternating exists/forall - the typical minmax game scenario)
Lcs - word problem for CSLs
EXPTIME-complete: best move in Chess or Go on an n×n board [wiki]
EXPSPACE-complete (hardest problems solvable in exponential space): Lrex - does a regular expression with exponentiation denote all strings?
do two regular expressions with star and squaring represent different languages? [wiki]
harder than any problem ndet'ly solvable in exponential time: is an untyped term typable in the second order polymorphic LC (aka F2)? [Hen+]
the theory of reals with addition
non-elementary time: is an untyped term typable in Fomega? [Hen+]
do two lambda-terms typable in the first-order typed LC have the same normal form? [by Statman 79 - found in Hen+]
evaluation of a formula with quantification over (higher-order) Boolean functions [found in Hen+]
non-primitive recursive: the Ackermann function
semi-decidable (RE-complete):
the Halting Problem is the canonical RE-complete problem [zoo]
determining, in F<'s type system, whether one type is a subtype of another [B Pierce: Bounded quantification is undecidable; POPL'92]
Lu - pairs of TM encoding and accepted word
PCP - Post's Correspondence Problem
ambiguity of CFGs, inherent ambiguity of CFLs
a CFG accepts everything,
the intersection of the languages of two CFGs is empty / is c.f.,
the complement of a CFG's language is c.f.,
a CFG's language is regular
non-computable - cf. hypercomputation:
TMs with oracles, and some forms of Quantum computers, ...
Arithmetic hierarchy: the halting problem for level Σn sits in Σn+1
Ld - the diagonal language of TM encodings
Le, Lr, Lnr - emptiness / recursiveness / non-recursiveness of a TM's language
TM accepts everything, accepts exactly one word, accepts a regular set

Cf. Design and Analysis of Algorithms with some complexity examples.

For some of the (shown and further) complexity classes it has not yet been proven whether they are in fact separate or only appear so. In particular, the possibility of ``P = NP'' is repeatedly considered, and there are several results on its implications should it be true, but no proof put forward so far has been accepted by the scientific community (a proof of equality would reclassify the NP-complete problems from probably intractable to tractable). ``NP = NL?'' (ndet. polynomial time vs. ndet. logarithmic space) is another open separation question.
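As a reminder of what membership in NP does and does not give, here is a minimal sketch (the CNF encoding and function names are ad hoc): verifying a proposed assignment ("certificate") against a CNF formula takes polynomial time, while the only known general way to find one is essentially to search all 2^n assignments.

    # SAT: polynomial-time certificate check vs. exponential brute-force search.
    from itertools import product

    def check(cnf, assignment):
        """Polynomial-time verifier: every clause must contain a satisfied literal."""
        return all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in cnf)

    def brute_force_sat(cnf, variables):
        """Exponential search over all 2^n assignments (n = number of variables)."""
        for bits in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, bits))
            if check(cnf, assignment):
                return assignment
        return None

    # (x1 or not x2) and (x2 or x3) and (not x1 or not x3); variables encoded as 1..3
    cnf = [[1, -2], [2, 3], [-1, -3]]
    print(brute_force_sat(cnf, [1, 2, 3]))   # -> {1: False, 2: False, 3: True}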

Further classes:

One of the few problems that provably needs more than polynomial run time is deciding the truth of statements in Presburger arithmetic.
Ulf Schünemann 210601