Human-Machine Interaction - Evaluation View - Measurement - Example1

    Attention: This script is not a complete representation of the oral lecture, and it is not yet completely finished.

AUTHOR: Gerd Döben-Henisch
EMAIL: doeben_at_fb2.fh-frankfurt.de

Evaluation View - Measurement - Exercise 1

After a first introduction, the students ran a first observation experiment (cf. Exercise Nr. 1). Here we discuss the results of that experiment.

Cigaret Machine - Ease of Use

In one experiment a student observed the handling of 5 different cigaret machines. Every machine M had its own user interface (UI), which could be characterized by some observable properties (P). The user tried every machine, and after each trial he had to express his 'feeling about the handling' with a number (n) between 1 and 10. The number was interpreted as a 'degree of the Ease of Use': '1' was the label for the lowest degree of Ease of Use and '10' the label for the highest.

With regard to the presupposed theoretical framework of measuring, we can say that the target object (TO) was the feeling of the subject correlated with a sequence of interactions with the user interface of the cigaret machine. The reference object (RO) was a finite set of numerals [1,10], with the interpretation that these numerals represent the beginning of the sequence of integers and that the numerals can therefore be handled like integers, also with regard to the ordering relation '<'.

Thus it is assumed that the subject of an experiment interacts with the user interface and that, while perceiving this user interface, he has certain feelings which are interpreted as representations of the theoretical term 'Ease of Use'. It is furthermore assumed that these feelings have different 'intensities', each of which the subject can relate to a numeral from a finite set that is ordered with regard to the represented intensities. Thus, if the subject 'feels very bad' during a sequence of interactions, he can label this feeling e.g. with the numeral '3'. If he feels 'very good', he can perhaps label this feeling with the numeral '9'. Empirically the subject will rate the label '9' 'higher' than the label '3', i.e. 3 <e 9. With such a procedure it is possible to rank order all the tested machines with regard to the term 'Ease of Use'.
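The rank ordering described above can be illustrated with a minimal Python sketch. The machine names and labels are invented for illustration and are not the student's data; the function name rank_machines is likewise hypothetical. Machines with equal labels share a rank, since the procedure only assumes the relations '<' and '=' between numerals.

```python
# Hypothetical ratings: machine name -> Ease of Use label in [1,10].
# These values are invented for illustration only.
ratings = {"M1": 3, "M2": 9, "M3": 6, "M4": 3, "M5": 7}


def rank_machines(ratings):
    """Return (label, machines) pairs ordered from lowest to highest label.

    Machines that received the same label are placed in the same group.
    """
    for machine, label in ratings.items():
        if not 1 <= label <= 10:
            raise ValueError(f"label for {machine} outside [1,10]: {label}")
    groups = {}
    for machine, label in ratings.items():
        groups.setdefault(label, []).append(machine)
    return [(label, sorted(groups[label])) for label in sorted(groups)]


for label, machines in rank_machines(ratings):
    print(label, machines)
```

The sketch only uses the ordering of the labels; it does not compute differences or averages, which would require stronger assumptions about the scale.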

Figure: Experiment 1, Cigaret Machine; Property 'Easy to Use'

Some questions can be raised here:

  1. How 'stable' is this measurement procedure with regard to the differences between the labels 1 to 10? Would fewer labels be more accurate?

  2. How stable is this measurement procedure between different subjects? Would the same task with the same machine produce labels which are 'similar' within a deviation of x?
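The second question can be made concrete with a small sketch. It assumes, as the text does, that the numerals can be handled like integers, so that the spread between the lowest and the highest label is meaningful. The labels, the subject names, and the choice of 'spread' as the measure of deviation are all assumptions made for illustration.

```python
# Hypothetical labels given by four subjects to the same machine and task.
labels_by_subject = {"S1": 6, "S2": 7, "S3": 5, "S4": 7}


def label_spread(labels):
    """Difference between the highest and the lowest label."""
    values = list(labels)
    if not values:
        raise ValueError("at least one label is required")
    return max(values) - min(values)


def similar_within(labels, x):
    """True if all labels lie within a deviation of x from each other."""
    return label_spread(labels) <= x


print(label_spread(labels_by_subject.values()))       # spread of the labels
print(similar_within(labels_by_subject.values(), 2))  # stable for x = 2?
```

Other measures of deviation are possible; the spread is used here only because it needs nothing beyond the integer interpretation of the numerals.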

In a next step one can use these measurements to look for 'hints' as to which of the observable properties {P1, ..., Pk} of a user interface are responsible for 'bad labels' and which for 'good labels'. This can be achieved e.g. by 'isolating' those properties which are 'exclusive' to the machines with the bad labels.
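One possible reading of 'isolating exclusive properties' can be sketched with set operations: take the properties shared by all machines with bad labels and remove every property that also occurs on a machine with a good label. The property assignments, the labels, and the threshold separating 'bad' from 'good' are hypothetical.

```python
# Hypothetical observable properties per machine and hypothetical labels.
properties = {
    "M1": {"P1", "P2", "P4"},
    "M2": {"P1", "P3"},
    "M3": {"P2", "P3", "P4"},
    "M4": {"P1", "P4"},
}
labels = {"M1": 2, "M2": 9, "M3": 3, "M4": 8}


def exclusive_bad_properties(properties, labels, bad_threshold):
    """Properties common to all 'bad' machines and absent from all others.

    A machine counts as 'bad' if its label is <= bad_threshold
    (an assumption made for this sketch).
    """
    bad = [m for m, n in labels.items() if n <= bad_threshold]
    good = [m for m in labels if m not in bad]
    if not bad:
        return set()
    common_bad = set.intersection(*(properties[m] for m in bad))
    seen_in_good = set().union(*(properties[m] for m in good))
    return common_bad - seen_in_good


print(exclusive_bad_properties(properties, labels, bad_threshold=4))
```

Such a result is only a hint, as the text says: with few machines, a property can be exclusive to the badly rated ones by coincidence.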

Ticket Machine - Intuitional

In another experiment a student observed four different subjects interacting with one and the same ticket machine. The target object (TO) here was, for each subject, the duration of the interaction between the beginning of a task and its end. The reference object (RO) was a clock with the usual time units. The observation showed that different subjects using the same machine can produce different measured labels. The experimenter assumed that the labels (= numerals) can be interpreted as rational numbers representing the time points of a clock, which can be ordered by an ordering relation '<'. Thus a 'lower' numeral a compared to a 'higher' numeral b can be interpreted in the sense that 'b' represents a longer duration than 'a'. The experimenter uses duration to indicate a 'degree of intuitionality': a short duration is interpreted as indicating high intuitionality and a long duration as indicating low intuitionality. The measured labels thus allow an ordering of the subjects according to the durations of their interactions.
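A minimal sketch of this ordering, assuming each observation is recorded as a pair of clock readings (start, end) in seconds. The readings are invented, and Fraction is used only to mirror the interpretation of the labels as rational numbers.

```python
from fractions import Fraction

# Hypothetical clock readings in seconds: subject -> (start, end).
observations = {
    "S1": (Fraction(0), Fraction(85, 2)),
    "S2": (Fraction(10), Fraction(100)),
    "S3": (Fraction(5), Fraction(35)),
    "S4": (Fraction(0), Fraction(61)),
}


def durations(observations):
    """Duration of each interaction as end minus start."""
    result = {}
    for subject, (start, end) in observations.items():
        if end < start:
            raise ValueError(f"end before start for {subject}")
        result[subject] = end - start
    return result


def order_by_duration(observations):
    """Subjects ordered from shortest to longest interaction."""
    d = durations(observations)
    return sorted(d, key=d.get)


print(order_by_duration(observations))
```

Under the experimenter's interpretation, the first subject in the resulting list showed the highest 'intuitionality' and the last subject the lowest.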

Figure: Ticket Machine - Intuitional

This second experiment shows that, for the evaluation of a machine M, it can be necessary to classify the test subjects according to their individual ability to have an intuition for the usage of a machine. If one has 5 different subjects S1, ..., S5 who show totally different values for a certain machine M1, then it would not be possible to evaluate other machines M2, ..., Mk reliably with these subjects. Because the subjects themselves differ, it would be difficult to attribute differences in the measurements to the machines under investigation.

This introduces some cases to be tested:

  1. Through a series of pre-tests, select groups of subjects (functioning as equivalence classes) which show 'similar' results with the same tasks and user interfaces.

  2. Evaluate machines only with regard to pre-tested classes of subjects.

  3. Use the measured differences between machines to get hints as to which kinds of properties are responsible for 'bad labels'.

  4. Try to characterize the pre-tested classes of subjects according to important properties.

  5. Evaluate the machines again only with regard to pre-tested classes of subjects, but in this case with classes of subjects which are characterized by tested properties.

  6. Use the measured differences between subjects, who are characterized by properties, to get hints as to which kinds of human properties can be related to bad labels during interactions with a machine.
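The first case, forming classes of subjects from pre-test results, can be sketched as follows. The pre-test values (durations in seconds) and the grouping rule are assumptions: subjects are sorted by their pre-test value, and a new class is started whenever the gap to the previous subject exceeds max_gap. This yields a partition of the subjects, but it is only one of several possible definitions of 'similar'.

```python
# Hypothetical pre-test results: subject -> task duration in seconds.
pretest = {"S1": 42, "S2": 90, "S3": 30, "S4": 61, "S5": 38}


def group_subjects(pretest, max_gap):
    """Partition subjects into classes of 'similar' pre-test results.

    Subjects are sorted by value; neighbours whose values differ by at
    most max_gap are placed in the same class.
    """
    ordered = sorted(pretest, key=pretest.get)
    classes = []
    for subject in ordered:
        if classes and pretest[subject] - pretest[classes[-1][-1]] <= max_gap:
            classes[-1].append(subject)
        else:
            classes.append([subject])
    return classes


print(group_subjects(pretest, max_gap=10))
```

Because neighbours are chained, the first and last member of a class may differ by more than max_gap. If that is not acceptable, a stricter rule, such as limiting the spread within each class, would be needed.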