What is Consciousness Psychology?

What is consciousness? The continuum of arousal; being awake is not the same thing as being aware. What parts of the brain are important for arousal & wakefulness? What evidence is there for this? Midbrain: reticular formation; thalamus: intralaminar nuclei; animal studies in which the brainstem and midbrain are severed lead to coma. Why is the midbrain important for consciousness? It makes consciousness possible.

Why is the cortex important for consciousness? It provides the content. How do we know if people are conscious? Are people who seem unconscious really unconscious? What are 3 examples of cases?

No definitive way to measure consciousness

1. Vegetative state
2. Locked-in syndrome
3. Anesthesia

Major issue for anesthesiologists; some patients become conscious during surgery but drugs restrict movement and they can't communicate; not always remembered; no routine neural activity monitoring as of now.

What are the 3 key observations about consciousness?

1. Consciousness is related to attention
2. People who seem unconscious may be conscious
3. People who are conscious may carry out high-level cognition without awareness

What (3) evidence supports that consciousness is related to attention?

1. Change blindness
2. Attentional blink
3. Visuospatial neglect

What (2) cases support that people who seem unconscious may be conscious?

1. Locked-in syndrome
2. Vegetative state

What (2) pieces of evidence support that people who are conscious carry out high level cognition without awareness?

1. Subliminal perception
2. High-level processing of neglected material in visuospatial neglect

How does attention affect consciousness? We are not always aware of everything, and material that is not attended to is not consciously perceived, e.g. change blindness, attentional blink (miss T2), visuospatial neglect (not aware of one side of space). Locked-in Syndrome & Consciousness - Individuals appear to be in a vegetative state but may be fully conscious; due to brain-stem damage (no voluntary movement); may be able to communicate if eye movement is spared, e.g. a journalist wrote a memoir by blinking an eye.

Vegetative State & Consciousness - Appear to be awake but show no signs of awareness; diagnosis requires no reproducible evidence of purposeful behavior in response to external stimulation (e.g. squeeze my hand); adequate test of consciousness? What evidence is there for individuals in a vegetative state being conscious? What do the results imply? Patient in vegetative state asked to perform 2 mental imagery tasks during fMRI scanning (imagine playing tennis, imagine going through rooms of house); showed same activation as controls; implies that a patient who appears unconscious may in fact be conscious.

Subliminal Perception & Consciousness - Briefly present image of mean boy or nice boy (not consciously aware bc presentation is so quick), then show a neutral image of the boy and judge whether boy is nice or mean; found that unconscious stimuli affected judgements; implies that unconscious stimuli are processed to a high level.

What evidence is there that neglected material is still highly processed in visuospatial neglect patients? What does this imply? Shown images of 2 houses, one normal and one with a fire on the side which the patient neglects; asked if they are the same (says same) and which one she'd rather live in (picks non-burning house on 17/20 trials); then shown 2 images of a house, one normal and one with a fire on the side the patient doesn't neglect (says different and picks non-burning house 100% of the time); implies high-level unconscious processing of neglected material.

What are the 3 hypotheses regarding the neural basis of consciousness? What evidence supports each hypothesis?

1. There are specialized neurons that give rise to conscious experience; masked priming
2. Consciousness is a byproduct of particular brain systems that have some other primary function; blindsight
3. Consciousness is a state of integration among distinct brain systems; split-brain

Evidence for specialized neurons: Masked Priming Methods - fMRI study: show a prime (unconscious bc presented for a brief amount of time) that is either the same as or different from the target word; subjects have to categorize whether the object is manmade or natural; responses were faster if preceded by a same-word prime.

Evidence for specialized neurons: Masked Priming Results - Prime=unconscious, Target=conscious; bilateral prefrontal & parietal regions are specialized for consciousness; unconscious activated inferior ventral reading circuits; conscious showed greater activity that ignites parietal & prefrontal lobes (overlap w/ attention network).

Evidence for Consciousness as a Byproduct: Blindsight Methods - Patient w/ no primary visual cortex (det w/ fMRI) and no visual fields asked to navigate obstacles in a hall w/o a cane; performed well, avoided the obstacles in the hall.

Evidence for Consciousness as a Byproduct: Blindsight...How is this possible? Could NOT be based on ventral or dorsal streams...Subcortical visual pathway (retina to superior colliculus) is the source of blindsight.

What is the difference between blindsight patients and neurally intact individuals when navigating? Some brain systems give rise to consciousness when functioning normally; ventral stream=conscious perception (intact individuals); dorsal and subcortical streams (source of blindsight) do not give rise to conscious perception.

Evidence for Consciousness as a State of Integration: Split Brain Methods - Patient w/ callosal thinning/lesions to corpus callosum; asked math questions and patient uses R and L hand to point to answer; right hand shows correct answer then drifts to incorrect answer that left hand shows.

Evidence for Consciousness as a State of Integration: Split Brain Results - Showed that language provides an interpretation of the experience and plays a role in conscious experience; communication is key to a normal experience.

Dehaene: Global Ignition Theory General Experimental Approach. Use the same stimuli and compare the unconscious and conscious experience to reveal the neural bases of consciousness.

Four signatures of conscious perception

1. Intense ignition of bilateral prefrontal and parietal regions
2. Conscious and unconscious experiences are distinguished by frontal ERP P300 responses (attentional blink paradigm)
3. Massive increase in gamma-band activity 300ms after stimulus presentation (neural ignition) in local circuits
4. Synchronization of gamma oscillations across distant brain regions, approx 300ms after stimulus presentation

Intense ignition of bilateral prefrontal & parietal regions - Masked priming task: unconscious-->inferior ventral reading circuits; conscious-->ventral activity that ignites parietal & prefrontal lobes.

Conscious & Unconscious Trials Distinguished by Frontal P300 Results - Compared ERP for detected and undetected T2; visual cortex had similar conscious & unconscious brain waves; prefrontal cortex diverged around 200ms for conscious and unconscious trials, large positive voltage in prefrontal cortex (P3 (P300) wave).

Massive increase in gamma-band activity 2-300ms after stimulus ignition in local circuits: methods
Individuals w/ intractable epilepsy using ECoG (electrocorticography, intracranial EEG): presented face or obj for 16ms followed by blank screen (stimulus onset asynchrony SOA) w/ duration of 16-250 ms followed by a scrambled image; participants report category of object or if didn't recognize.

Massive increase in gamma-band activity 2-300ms after stimulus ignition in local circuits: results
SOA duration influenced awareness of target, smaller SOA=decrease in accuracy; Gamma power modulations in first 20-300ms after image onset depending on whether image was correctly categorized or not; modulation occurred ONLY in ventral cortex (face, object, and house selective regions).

Synchronization of Gamma Oscillations in Distant Brain Regions: Methods - Epilepsy patients, depth electrodes: word or blank presented for 29ms with or w/o mask; report whether word is threatening or non-threatening.

Synchronization of Gamma Oscillations in Distant Brain Regions: Results - Compared degree of gamma synchronization for masked and unmasked words; found "brain web": more long-distance synchronization for unmasked, seen words.

Parallels between 4 Signatures of Consciousness and 3 Hypotheses


Intense ignition & frontal ERP-->specialized neurons
Gamma-band activity--> byproduct of brain systems
Synchronization of gamma oscillations in distant regions-->state of integration

What is the nucleophile and what is the solvent in the Sn1 rxn?

What else is present in the solvent used for SN1? In what ratio? Acetone, 25:75 (water:acetone). Why are 3 deg carbocations the most reactive in SN1 reactions? In SN1 reactions a carbocation intermediate is formed in the rate-determining step; a higher degree of substitution means a more stable carbocation, so it forms more readily.

In the formation of the Grignard reagent, is magnesium oxidized or reduced? What is this step called?
Oxidized (goes from 0-->+2); oxidative insertion of magnesium. What happens if water is present during the formation of the product? What is this called? The water will react with the G. reagent faster than the carbonyl target does, forming benzene and MgBrOH; quenching.

What is used as the solvent in the Grignard reaction and why? ANHYDROUS diethyl ether; volatile, so it forms a vapor cloud above the rxn which prevents oxygen gas from reacting w/ the G. reagent and forming hydroperoxides; also helps stabilize the G. reagent. Autocatalytic - One of the reaction products is also a catalyst for the same or a coupled rxn. Where does the rxn initiate in the Grignard rxn? Surface of the Mg. Why must magnesium be ground? To get rid of magnesium oxide and expose a fresh surface for the rxn to initiate.

What is the purpose of iodine in the Grignard lab?

1. Indicator-fades as Mg is activated
2. Activator-chemically cleans oxide layer on Mg to increase reactivity

Used during reflux in sep funnel and condenser; CaCl2 + CoCl2 turns pink if wet; equalize pressure; prevent moisture from atmosphere from entering system and allow the rxn flask to be open so gas pressure doesn't build up. Why is additional anhyd. diethyl ether added after the rxn initiates? Why not add before this? To dilute the reaction and prevent the formation of a dimer; not added before bc this reduces the conc of bromobenzene near the surface of Mg, making it difficult to initiate the G. rxn.


How can the formation of the biphenyl dimer be avoided?

1. By adding the phenyl bromide slowly so it reacts only with magnesium; keep concentration low enough to avoid reacting w/ G. reagent.

2. Use dilute sol'n conditions to reduce rate of dimer formation.

Why is the reaction quenched with sulfuric acid? To protonate the alkoxide product and give triphenylmethanol; react w/ unreacted Mg; convert Mg(OH)2 (insoluble) into MgSO4 (soluble). Why is a sodium bicarbonate sol'n used as a wash in the Grignard lab? Probably to neutralize the rest of the sulfuric acid we used earlier.

What is used as part of the drying process in the Grignard lab?

1. Brine
2. Magnesium Sulfate

What solvent is used for the Grignard reflux? Anhyd. diethyl ether. What solvent is used for the simple distillation in the Grignard lab? Why? Hexanes; product is less soluble in hexanes than the other materials present in organic layer with the product.

How is the temperature maintained to the BP of the solvent during the Grignard reflux? Heating mantle and Variac are used; assume it was maintained. Why is hexanes used for simple distillation in G. lab? Because triphenylmethanol is less soluble in hexanes and thus can be isolated; also other impurities such as biphenyl, benzene, and unreacted starting materials that are in the organic layer are much more soluble in hexanes, so it allows us to obtain a purer product.

Chromatography in which the substances to be separated are introduced onto the top of a column packed with an adsorbent (as silica gel or alumina), pass through the column at different rates that depend on the affinity of each substance for the adsorbent and for the solvent or solvent mixture, and are usually collected in solution as they pass from the column at different times.

Preparing the Sample (Bixin)

1. Add 10% Et/DCM to RB from last week
2. Use pipette to dissolve as much extract as possible

Why must the sample be dissolved before adding? If the sample isn't dissolved, it will dissolve slowly and contaminate the column. Why must the sample be loaded in a narrow band for column chromatography? To get good separation and ensure the test tubes have a good concentration (wider bands have a lower concentration).

The silica slurry should be added to the column using _________ and should be added _ly. A powder funnel; quick. Why must the solvent level stay above the top of the silica? To avoid cracking. Why is it important not to have cracks in the silica gel? It allows compounds to flow through the cracks instead of separating through the silica gel.

After adding the sample, 2 mL increments of solvent should be added until when? Until the solvent over the silica gel is colorless, then the remainder of the solvent should be added. How should the extract be loaded into the column? With a glass pipette along the top in a circular motion to avoid disturbing the top of the silica gel.

What order should the different colors come out? When should collection start? When should it stop? Yellow-125 mL Erlenmeyer, Intense orange-red bixin band-test tubes, Stop after this. Which fractions should be used for testing purity? The one right before the orange-red started eluting to the fraction where it stopped; spot 1 in every 3 fractions. How do we determine which fractions to combine? By using TLC plate results to determine which samples are pure.

When should the final TLC plate be done? What samples go on that plate? Before Rotovap; 4 lanes: initial residue, standard, combined, and conc. combined (4 spots). Where should excess sol'n from the column be disposed of? C,H,O Halogenated. Where should unused fractions be disposed of? C,H,O Halogenated. How should the silica gel in the column be disposed of? Invert the column 1 inch above 4 empty weigh boats and allow to sit. Where should extra silica gel be disposed of? The waste silica gel container.

What are the four methods of characterization used in lab?

1. TLC
2. MP
3. Visible Spectroscopy
4. NMR

What are the pros and cons of TLC?

Pros: fast/easy, ID & purity
Cons: Nonidentical compounds can have same Rf, requires standard, not quantitative

What are the pros and cons of MP?

Pros: Classical method, some info on ID and purity
Cons: Contaminated or diff compound? Requires lit value, no way to det amt contamination

What are the pros and cons of visible spectroscopy?

Pros: info on electronic structure
Cons: more difficult to ID compounds, v difficult to ID and quantify impurities bc of overlapping peaks.

What are the pros and cons of NMR? Pros: quantitative, can be used to ID new compounds. Cons: Machines are expensive, more technically challenging to run samples, requires ability to interpret new spectra.

Template-Matching as an explanation of pattern recognition in HUMAN ?

In contrast to Wundt's program, behaviorists wanted to focus on External observable behaviors. (Behaviorists believe that mental processes do not exist). Noam Chomsky's work in linguistics was crucial to the birth of cognitive psychology because It showed how complex and creative human language was, and how inadequate behaviorist theory was to explain it. (Noam Chomsky rejected the purely behaviorist explanation of human language as "verbal behavior". He emphasized the novelty of human language and the internal rules for language use).

According to Sternberg's analysis of cognitive processes, a subject has to Encode the stimulus; compare it with the memory set; then decide and respond (RT = Encoding + Search + Decision + Response. These are the 4 operations/processes that the subject is assumed to go through).

In explaining cognition, the connectionist approach emphasizes interactions between individual processing units in the brain (Cognitive processes occur in parallel and are distributed across multiple levels of units; "interaction" means that there is bidirectional influence between lower and higher levels of units). In response to a difficult question, the participant is likely to respond more slowly than if an easy question had been asked. In terms of the overall response times, the difficult question would yield Response times with higher numbers.

Which of the following is NOT an assumption of a STRICT information-processing approach?

Parallel processing (Process model assumes:
•Total RT = sum of the duration for each independent stage
•stages of processing are in a fixed sequence, one at a time
•stages are functionally independent, no overlapping)
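
The additive-stage assumption can be illustrated with a small sketch. The stage durations in milliseconds and the idea that search time grows with memory set size are hypothetical illustration values, not figures from these notes.

```python
def total_rt(encoding, search_per_item, set_size, decision, response):
    """Strict process model: stages run one at a time, so RT is a plain sum."""
    search = search_per_item * set_size  # assumed: one comparison per memory-set item
    return encoding + search + decision + response

# Hypothetical stage durations in ms.
rt_small = total_rt(encoding=100, search_per_item=40, set_size=2, decision=50, response=150)
rt_large = total_rt(encoding=100, search_per_item=40, set_size=6, decision=50, response=150)
print(rt_small, rt_large)
```

Because the stages do not overlap, changing one stage (here, the search stage) shifts total RT by exactly that stage's change.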

These processes involve conscious processing, conscious awareness that a task is being performed, and usually conscious awareness of the outcome of that performance. Explicit

According to the textbook, which of the following is the least likely to be a feature of performance on a visual search task that leads to the apparent pop-out of the target from the background distracter items? Serial search. The ability to attend to one source of information while ignoring or excluding other ongoing messages around us - Selective attention.

The attention process that keeps a person from searching the same place over and over - Inhibition of return. Masking - A later visual stimulus affecting the perception of an earlier one. Auditory information endures in auditory sensory memory for two to four seconds (depending on the stimulus material), after which it fades away (decays). How long does visual sensory memory last? For no more than 250 milliseconds. Which of the following is NOT a Gestalt grouping principle? Reduction.

The disruption of auditory span when there is a sound following the end of a series is called - Suffix effect. Change blindness is associated with - Inability to pick up changes that occur during saccades. Which is NOT true of behaviorism? The first major school of thought in experimental psychology. Which of the following was NOT a challenge to the behaviorist approach? S-R learning. Most associated with the "Method of Savings" - Hermann Ebbinghaus.

Which of the following is NOT a fundamental assumption of cognitive psychology? Introspective methods allow useful cognitive insights. Wrote a review of Skinner's "Verbal Behavior." This review clearly illustrated the shortcomings of the behaviorist account of language - Chomsky.

Which of the following is a common analogy used by cognitive psychologists to describe or characterize how people think? Digital computer. A _____ model is a hypothesis about the specific mental processes that take place when a particular task is performed. Process. The top layer of the brain, responsible for higher-level mental processes. Neocortex

A computer-based technique for modeling complex systems. Knowledge is represented by the strength of the excitatory or inhibitory connections between massively interconnected nodes. The kind of processing that is heavily reliant on information from the environment - Data-driven.

If I present a name for 50 milliseconds (on a computer screen) and then replace the word with a picture of a New Orleans celebration on top of the location where the word used to be, this is most likely to be part of a study investigating Backward masking (a later visual stimulus can drastically affect the perception of an earlier one).

Context effect - Influence of surrounding information and your knowledge on processing; given that the picture showed up later, instead of at the same time as the name, the task is not studying context effect. Decay - A passive process like fading (a forgetting mechanism).

Which of the following best illustrates "top-down" processing? Warren and Warren's (1970) phoneme-replacement studies. (The same speech segment "*eel" was recognized as different words (e.g., heel, peel) to fit in the context of the sentence determined by the last word (e.g., shoe, orange). Thus, it demonstrated that perception and identification of speech are heavily dependent on context, on top-down processing.) Pandemonium model - A feature-based model that assumes completely bottom-up processing. Template-matching models of visual or auditory processing - Cannot account for top-down processing.

Consider your recognition of the letter "A" in this particular letter sequence: "ANT". Which of these statements describes the role of bottom-up processes (a.k.a. data-driven processes)? You recognize the letter "A" because of the two slanted lines and cross-line. (Bottom-Up Processing (data-driven processing) means that the processing of the stimulus is guided largely by the lower-level data, such as features and elements in the pattern itself--the slanted lines and cross-line in the recognition of "A".

All the other choices are example of top-down processing, in which the processing is guided by the higher-level knowledge already stored in memory, e.g. the word "ant", coherent phrase or sentence of certain topic...etc)

Template-Matching as an explanation of pattern recognition in HUMAN Is not satisfactory because human pattern recognition is so flexible while template matching is rigid.

Visual information endures in visual sensory memory for about 250 milliseconds, after which it fades away (decays). How long does auditory sensory memory for spoken language or other complex sounds last? About 2-4 seconds. According to Treisman's Attenuation Model, which of the following would a research participant most likely be able to hear if it occurred on the unattended channel in a shadowing task?
The name of the street on which the research participant lives

Which of these describes the Stroop effect? Naming the ink color of the word is especially difficult if the color conflicts with the word. The early filter theory of selective attention (Broadbent) suggests that when several sensory stimuli are presented the research participant can switch to one or another channel. (According to Broadbent's Early Filter Model, unattended messages are NOT processed, except in terms of physical properties; only attended messages are processed; channel shifting is possible and it takes time.)

In visual search - The kinds of distractors can critically influence search rates (Conjunction searches are generally slower than disjunction searches; because conjunction search is serial and conscious, while disjunction search is parallel and automatic)

The kinds of distracters can critically influence search rates because they affect whether you are searching for ONE simple feature (disjunction search) or a combination of two features (conjunction search).

How to find out to what patterns a CNN is sensitive?

Like those of the first layer, are the weights of the later layers self-explanatory? Why? They are not, since they create weighted recombinations of the features of earlier layers.

Deep Dream:

1) Train the CNN to recognize the images.

2) Fix the weights, take the partial derivatives of the target activation with respect to the input, and optimize the input with gradient descent.
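
A minimal sketch of step 2, assuming a single logistic unit with made-up fixed weights stands in for the trained CNN. It runs gradient ascent on the unit's activation, which is the same as gradient descent on the negative activation; only the input changes, never the weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def activation(weights, bias, x):
    """Output of one logistic unit; a toy stand-in for a target CNN output."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def dream_input(weights, bias, x, steps=200, lr=0.5):
    """Optimize the INPUT while the weights stay fixed."""
    x = list(x)
    for _ in range(steps):
        a = activation(weights, bias, x)
        # Partial derivative of the activation with respect to each input value.
        grad = [a * (1.0 - a) * w for w in weights]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x

weights = [2.0, -1.0, 0.5]   # hypothetical "trained" weights
bias = -1.0
start = [0.0, 0.0, 0.0]
dreamed = dream_input(weights, bias, start)
print(activation(weights, bias, start), activation(weights, bias, dreamed))
```

The optimized input grows along the pattern stored in the weights, which is the sense in which the pattern the unit is sensitive to appears in the input.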

The result is the input that creates the desired output activation. The patterns to which the CNN is sensitive evolve in the input to make it fall into the target category.

How to get a reconstruction of the receptive field of neurons in a feature map?

1) Feed a huge amount of images into the Convnet.

2) Select the images that cause strongest activation of a particular feature map.

3) Approximate the pattern that caused the strong activation in that feature map with the help of a reverse Convnet (a.k.a. "Deconvnet"):

3.1) Unpool the feature maps.
3.2) Convolve unpooled feature maps with transposed filters of the Convnet.
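
Step 3.1 can be sketched in one dimension, assuming non-overlapping max-pooling windows of size 2. The function names are hypothetical; the point is only that the forward pass remembers where each maximum was, and unpooling writes the pooled value back to that position.

```python
def max_pool_with_switches(signal, size=2):
    """Max pooling that also records the position of each maximum."""
    pooled, switches = [], []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        best = max(range(size), key=lambda i: window[i])
        pooled.append(window[best])
        switches.append(start + best)
    return pooled, switches

def unpool(pooled, switches, length):
    """Place each pooled value back at its remembered position; zeros elsewhere."""
    out = [0.0] * length
    for value, pos in zip(pooled, switches):
        out[pos] = value
    return out

signal = [1.0, 3.0, 2.0, 0.5, 4.0, 4.5]
pooled, switches = max_pool_with_switches(signal)
restored = unpool(pooled, switches, len(signal))
print(pooled, switches, restored)
```

The restored map is sparse: everything except the remembered maxima is zero.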

How does the unpooling of the feature maps work in the "Deconvnet"? The activations are mapped back to the positions of the maximum values memorized during the forward step. How does transposed convolution work in the "Deconvnet"? It works by convolving a sparse feature map, which has been created by the unpooling (Unmax) layer, with the transposed kernel of the Convnet. What is the goal of dimensionality reduction? What is it good for? Its aim is to map high-dimensional data onto a lower dimension while preserving the structure in the data. It is mostly used for visualizing data by reducing the high-dimensional data to two plottable dimensions.

What is t-SNE? How does it work? t-Distributed Stochastic Neighbor Embedding measures pairwise similarities between points. For each point, it measures the probability that neighboring points fall within e.g. a Gaussian with the selected point as mean. It does so once in the higher-dimensional space and once in the lower-dimensional space (where a Student t-distribution is used, hence the name), then minimizes the difference between the two probability distributions (measured with the Kullback-Leibler divergence) by adjusting the positions of the points in the lower-dimensional space with gradient descent. The outcome is different for different runs.

What is the Kullback-Leibler divergence? It is a measure of the difference between two probability distributions. What is perplexity? It is the expected number of close neighbors. Low perplexity emphasizes local structure whereas high perplexity emphasizes global structure. Are the relative differences in cluster sizes representative in t-SNE? No, they are not.
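
As a small worked example of the Kullback-Leibler divergence for discrete distributions, here is a sketch with two made-up distributions `p` and `q` (they are illustration values, not t-SNE neighbor probabilities).

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), skipping terms with p_i = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p), kl_divergence(p, q), kl_divergence(q, p))
```

The divergence is zero when the two distributions are identical and is not symmetric, so KL(p || q) and KL(q || p) generally differ.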

How do we find the right depth and width of an ANN? What are the limitations?

By (1) expanding, testing and comparing performance, and by (2) finding balance between over- and underfitting.

The limitations are (1) the information in the data and (2) the computational power.


What general differences are there between shallow and deep ANNs? Deeper models tend to perform better as they add more parameters, while shallow models start to overfit. Describe the "vanishing gradients" problem. The gradients decrease exponentially (and, eventually, vanish) as the error is backpropagated through the layers. As a result, the first layer does not learn and keeps its random initial values: the first layer "kills" all of the signal. This is especially a problem for very deep networks. Why is initializing the weights with larger values not a good way around the vanishing gradients problem? This would just cause the derivative of the logistic function to shrink alongside the gradient. One good alternative would be to pre-train the network. On what parameter(s) does the stability of the gradients depend? It depends on (1) the derivative of the activation function and (2) the values of the weights.
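
A rough numeric sketch of the vanishing gradient, assuming a chain of identical logistic layers with one weight each, so that every layer scales the backpropagated error by weight times the derivative of the logistic function. The weights and drives are illustration values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25

def backprop_factor(weight, drive, depth):
    """Scaling of the error after passing back through `depth` identical layers."""
    return (weight * sigmoid_prime(drive)) ** depth

shallow = backprop_factor(weight=1.0, drive=0.0, depth=2)
deep = backprop_factor(weight=1.0, drive=0.0, depth=10)

# Assumed: a larger weight also pushes the drive into saturation,
# so the per-layer factor shrinks instead of growing.
small_weight_factor = 1.0 * sigmoid_prime(0.0)
large_weight_factor = 5.0 * sigmoid_prime(5.0)
print(shallow, deep, small_weight_factor, large_weight_factor)
```

Both ingredients named above show up in the per-layer factor: the derivative of the activation function and the value of the weight.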

On what parameter(s) does the network's update rule depend? It depends on the gradient. On what parameter(s) does the network's "training speed" depend? It depends on the size of the gradient (large gradient = high training speed). What do we understand by "appropriate" biases? When and in what order should they be learned? What effect does it have on learning? We need the biases to shift the data into an undecided state (regarding the activation function). They need to be learned before the learning starts, and sequentially, starting from the second layer. This slows down early learning.

In the case of the rectified linear unit (ReLU) function, what if the backpropagation step maps all possible inputs to a negative drive? In this case, the ReLU gate never fires: the output and derivative are 0. The error cannot flow through the ReLU. Its weights won't be updated anymore. The ReLU is "dead". This can be avoided by using a small learning rate and slightly positively initialized biases.

What are the advantages/disadvantages of ReLU compared to tanh in very deep networks? While tanh is subject to the vanishing/exploding gradients problem (if the network has not been pre-trained) and its activation and derivative are expensive to compute, ReLU makes the network trainable without pre-training and has a very cheap activation and derivative.

However, the mean activation of tanh is 0, while that of ReLU is > 0: for ReLU the biases need to be adjusted first. Also, if the backpropagation step maps all possible inputs to a negative drive, then the ReLU gate (unlike the tanh one) never fires, as the output and the derivative are 0. The error then cannot flow through the ReLU and the weights won't be updated anymore: the ReLU is "dead".

How do we prevent the dying ReLU problem from happening? We prevent it from happening either by using a small learning rate and slightly positively initialized biases, or by using a leaky ReLU instead of ReLU.
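
The difference between a dead ReLU and a leaky ReLU can be seen directly in their derivatives. This is a minimal sketch; the negative slope of 0.01 and the example drives are assumed values.

```python
def relu(z):
    return max(0.0, z)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

def leaky_relu(z, slope=0.01):
    return z if z > 0 else slope * z

def leaky_relu_grad(z, slope=0.01):
    return 1.0 if z > 0 else slope

# A unit whose drive is negative for every input it ever sees.
drives = [-3.0, -1.5, -0.2]
relu_grads = [relu_grad(z) for z in drives]
leaky_grads = [leaky_relu_grad(z) for z in drives]
print(relu_grads, leaky_grads)
```

With plain ReLU every gradient is 0, so no error flows back and the weights stop changing; the leaky variant keeps a small nonzero gradient, so the unit can still recover.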

What are the advantages/disadvantages of ELU? It saturates for negative values only, which makes it noise-robust with no penalty for highly positive values, and its mean activation is close to 0 (the gradients for the biases are stable).

It is, however, expensive to compute. How do we adjust the biases for ReLU? With batch normalization. Regarding the distribution of the data, how can we speed up learning? By centering the data around 0 by subtracting the mean of each dimension of the training data. This places the data around the undecided state of the activation function and hence speeds up the learning.
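
Centering amounts to one mean per input dimension, computed on the training data only. A minimal sketch with a made-up 3-sample, 2-dimensional training set:

```python
def center(data):
    """Subtract the per-dimension mean of the training data from every sample."""
    n = len(data)
    dims = len(data[0])
    means = [sum(row[d] for row in data) / n for d in range(dims)]
    centered = [[row[d] - means[d] for d in range(dims)] for row in data]
    return centered, means

train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # hypothetical training samples
centered, means = center(train)
print(means, centered)
```

The same `means` would then be subtracted from validation and test samples, so all data are shifted consistently.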

Is an initialization of 0 for all the weights a good idea? Although such an initialization is clearly unbiased, it would cause all the outputs, hence all gradients and therefore all weight updates, to be identical. How do we tackle class imbalances? If the differences are not very large: draw balanced mini-batches. Sparse classes are then shown more often. If the differences are grouped in different classes: draw balanced mini-batches from different subgroups. It then shows larger groups more often. For very unbalanced classes: weight the loss. The loss for misclassified samples of small classes is then increased. Are learning rate and mini-batch size dependent? Yes they are.

Smaller mini-batches: smaller learning rate. As a result of this, cross entropy decreases (almost) linearly and accuracy reaches an early plateau. Larger mini-batches: larger learning rate. As a result of this, cross entropy explodes (possibly falls and zigzags) and accuracy is zigzagging heavily. How do we recognize a good learning rate in terms of cross entropy and accuracy? Cross entropy decreases steadily and accuracy increases steadily.

How do we find a good learning rate?


1) Start with large learning rate.
2) Divide learning rate by 2 (or 5).
3) Retrain the ANN and validate the performance.
4) Continue with 2) if performance increased.
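
The four steps above can be sketched as a loop. `train_and_validate` is a hypothetical stand-in for "retrain the ANN and validate": here it just minimizes w**2 with plain gradient descent and returns the final loss, so the example runs without any network.

```python
def train_and_validate(lr, steps=20):
    """Toy stand-in for retraining: gradient descent on f(w) = w**2."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w * w  # validation loss, lower is better

def search_learning_rate(start_lr, factor=2.0, max_rounds=20):
    """Start large, keep dividing while the validated performance improves."""
    best_lr = start_lr
    best_loss = train_and_validate(best_lr)
    for _ in range(max_rounds):
        candidate = best_lr / factor
        loss = train_and_validate(candidate)
        if loss >= best_loss:  # performance did not increase: stop
            break
        best_lr, best_loss = candidate, loss
    return best_lr, best_loss

best_lr, best_loss = search_learning_rate(1.6)
print(best_lr, best_loss)
```

On this toy problem the starting rate diverges, the halved rates improve, and the search stops once a further halving makes the result worse.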

How do we measure correlation in visual data?

Use the dot product as a measure of how much of a signal is present in another signal. Use the sliding dot product as a measure of how much of a local signal (a.k.a. features / patterns) is present in another signal. These features are represented by the weights and are learned via backpropagation. How can higher order features be represented in CNNs? What would be their input? They can be represented by the weights within additional hidden layers. They would receive preceding, lower order features as input. Are feed-forward neural networks biologically plausible? No, they are not biologically plausible. Their neurons are highly specialized with very complex receptive fields and receive all of the input.

Are CNNs biologically plausible? They are more biologically plausible than FFNNs:

- Processing is divided into several consecutive layers.
- Neurons in each layer combine their inputs to higher order features.
- Neurons process only a small subpart of the available information.
- Many neurons perform the same task for different parts of the input.


What is cross-correlation? How can it be used in CNNs. It is the sliding dot-product of an input and a kernel. It can be used for pattern-matching in CNNs. What is the difference between "same" padding and "valid" padding. In "same" padding the output size is equal to the input size, whereas in "valid" padding the border is dropped, resulting in a smaller output size.

What is the difference between cross-correlation and convolution? Convolution is the same as cross-correlation with a kernel that was rotated by 180 degrees.
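
A small 2-D sketch of this relationship, assuming "valid" padding and stride 1. The image and kernel values are arbitrary, and the helper names are made up for illustration.

```python
def rotate_180(kernel):
    """Rotate a 2-D kernel by 180 degrees (reverse the rows and each row)."""
    return [row[::-1] for row in kernel[::-1]]


def cross_correlate_2d(image, kernel):
    """'Valid' sliding dot product of a 2-D image and a 2-D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def convolve_2d(image, kernel):
    """Convolution written as cross-correlation with the rotated kernel."""
    return cross_correlate_2d(image, rotate_180(kernel))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

xcorr = cross_correlate_2d(image, kernel)
conv = convolve_2d(image, kernel)
print(xcorr)
print(conv)
```

For a symmetric kernel the two operations give the same result; here the kernel is not symmetric, so the signs flip.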

How can we efficiently reduce dimensionality in categorization tasks?

- Higher stride level.
- Max/Average pooling.

What is max pooling? What is its main downside? It consists in keeping only the strongest neuron's signal. The exact position of the feature is then lost.

What is average pooling? What is its main downside? It consists in computing the average of the neurons' signal strengths. The exact strength and exact position of the feature are then lost.
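
Both pooling variants can be sketched with one helper. Non-overlapping 2 x 2 windows are an assumption for this example, and the feature map values are invented.

```python
def pool_2d(feature_map, size, reduce):
    """Apply `reduce` to every non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce([feature_map[r + i][c + j]
                    for i in range(size) for j in range(size)])
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


def mean(values):
    return sum(values) / len(values)


feature_map = [[0, 0, 1, 0],
               [0, 8, 0, 1],
               [2, 2, 0, 0],
               [2, 2, 0, 4]]

max_pooled = pool_2d(feature_map, 2, max)
avg_pooled = pool_2d(feature_map, 2, mean)
print(max_pooled)
print(avg_pooled)
```

In the left column of windows, one strong response (8) and four weak ones (2, 2, 2, 2) both average to 2.0, which illustrates how average pooling loses the exact strength. In both variants the position inside the window is gone.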

How do we compute the number of floating point operations that have to be performed in order to transform an input to an output?

Number of floating point operations = Number of output feature maps/neurons x Number of input neurons x (Number of multiplications for a single weight + Number of sums for a single weight) + Number of biases (= Number of output neurons)
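
A worked example of this count. The layer sizes are made up. For the convolutional case, the example assumes that "number of input neurons" means the inputs feeding one output neuron (the kernel size) and that the output count is feature maps times output positions; the notes do not spell this out.

```python
def count_flops(num_outputs, inputs_per_output, mults_per_weight=1, sums_per_weight=1):
    """Outputs x inputs x (multiplications + sums per weight) + one bias add per output."""
    weight_ops = num_outputs * inputs_per_output * (mults_per_weight + sums_per_weight)
    bias_ops = num_outputs
    return weight_ops + bias_ops


# Fully connected layer: 784 input neurons, 100 output neurons (hypothetical sizes).
fc_flops = count_flops(num_outputs=100, inputs_per_output=784)

# Convolutional layer (assumed reading): 32 feature maps of 26 x 26 output neurons,
# each computed from a 3 x 3 kernel on a single-channel input.
conv_flops = count_flops(num_outputs=32 * 26 * 26, inputs_per_output=3 * 3)

print(fc_flops, conv_flops)
```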

How do we compute the number of degrees of freedom of a convolutional layer/fully connected layer? The number of degrees of freedom is equal to the number of weights and biases:

DoF of a convolutional layer = Each value in each of the kernels + One bias for each feature map.
DoF of a fully connected layer = Number of connections between the two fully connected layers + One bias for each neuron in the second layer.
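
The two counts as a short sketch. The layer sizes are invented, and the `in_channels` factor assumes that each kernel spans all input channels, which the notes do not state explicitly.

```python
def conv_layer_dof(num_feature_maps, kernel_height, kernel_width, in_channels=1):
    """Every kernel value plus one bias per feature map."""
    weights = num_feature_maps * in_channels * kernel_height * kernel_width
    biases = num_feature_maps
    return weights + biases


def fc_layer_dof(num_inputs, num_outputs):
    """One weight per connection plus one bias per neuron in the second layer."""
    return num_inputs * num_outputs + num_outputs


conv_dof = conv_layer_dof(num_feature_maps=32, kernel_height=3, kernel_width=3)
fc_dof = fc_layer_dof(num_inputs=784, num_outputs=100)
print(conv_dof, fc_dof)
```

Note that the convolutional count does not depend on the input size, because the same kernels are reused at every position.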

What is cognitive psychology?

Cognitive psychology is the science of mental function. It is concerned with the basic processes of learning, thinking, perceiving, acting, and feeling.

What is meant by the phrase, "It's a Jungle In There?"

"It's a Jungle In There" applies Darwin's insights to the inner workings of individual brains. The main idea is that the mind reflects competition and cooperation within the brain, much as Darwin's theory assumes competition and cooperation among species in the outer environment.

What is the main problem the "jungle principle" is meant to solve?

The main problem it is meant to solve is, Who decides things? A system of distinct elements that compete and cooperate can yield output that seems to suggest some inner mental executive who makes choices. No such executive is needed according to the jungle principle.

Why does the book stress the idea that mental "elves," "imps," and "demons" are just metaphors?
There are not actually little creatures in your brain with their own agendas. There are, instead, very simple, dumb elements cooperating and competing with one another. They know nothing. If they did, you would be faced with the problem of an infinite regress.

What is a neural niche?
Different neuron ensembles have different preferred stimuli. They fire more ("shout louder") when their preferred stimuli are presented. For example, they may have a higher firing rate for vertical lines, horizontal lines, right angles, curved lines, etc. In other words, different neurons are tied to different functions.

What is the infinite regress problem discussed on pages 2-3?
By attributing what you know to homunculi ("creatures" that know and direct things) in the brain, you create a continual, circular problem. If you know what you know because little creatures in your head told you so, who told those little creatures? Other little creatures told them. And who told those little creatures? . . . This goes on forever, an infinite regress. Not good.

What does the word "pandemonium" mean? What is the Pandemonium model? What was it designed to do and how does it work? In what way has it been found to be incomplete or incorrect?
Pandemonium means wild uproar: a state full of disorder and confusion.
The Pandemonium model is a model of perception based on a bottom-up hierarchy of feature detectors. Here, neurons with specific duties receive and analyze features of a stimulus. Lower-level neurons ("feature demons") fire more when sensory input matches their preferences (e.g., curves, vertical lines, angles, etc.). This information is passed on to mid-level neurons ("cognitive demons") that fire when there are matches between their preferred patterns of features and features noticed by the lower-level neurons. Still higher-level neurons ("decision demons") then decide what the input is based on which mid-level neuron ("cognitive demon") fires the most (i.e., has the most matching features). This model was designed to rule out competition, that is, to get to coherent, rational choices from seeming disorganization. However, in the model, communication goes only one way: from bottom to top. In reality, communication goes every which way, directly and/or indirectly.
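
A toy sketch of the three levels, to make the one-way flow concrete. The feature names and the three-letter inventory are hypothetical and chosen only for illustration; they are not taken from the book.

```python
# Hypothetical feature preferences of three cognitive demons (one per letter).
COGNITIVE_DEMONS = {
    "A": {"oblique_left", "oblique_right", "horizontal"},
    "H": {"vertical_left", "vertical_right", "horizontal"},
    "L": {"vertical_left", "horizontal_bottom"},
}


def feature_demons(stimulus_features):
    """Lowest level: each feature demon shouts (1) if its feature is present."""
    all_features = set().union(*COGNITIVE_DEMONS.values())
    return {feature: int(feature in stimulus_features) for feature in all_features}


def cognitive_demons(shouts):
    """Middle level: each demon adds up the shouts for its preferred features."""
    return {letter: sum(shouts[feature] for feature in preferred)
            for letter, preferred in COGNITIVE_DEMONS.items()}


def decision_demon(activations):
    """Top level: pick the cognitive demon that shouts the loudest."""
    return max(activations, key=activations.get)


stimulus = {"vertical_left", "vertical_right", "horizontal"}
activations = cognitive_demons(feature_demons(stimulus))
recognized = decision_demon(activations)
print(activations, recognized)
```

Each function only receives the output of the level below it. Nothing flows back down, which is exactly the limitation noted above.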

What are the three reasons to focus on mental functions?
(1) They explain how things work—they map onto the mental and behavioral functions they afford (e.g. smelling is different from seeing). The basic fact of experience suggests that the mind has distinct mechanisms corresponding to distinct qualities of experience. (2) The way we learn is, in general, gradual. (3) Damage to the brain can result in selective deficits.

Who was Ernst Mayr? Why is he important here?
Ernst Mayr was one of the most influential evolutionary biologists of the 20th century. He led the population thinking movement—thinking in terms of groups of organisms, not of individuals. This is important because population thinking lets scientists make use of powerful quantitative tools to analyze groups as a whole. These tools are useful in studies of neural ensembles.

How do neurons signal their "friends" and "enemies"?
Neurons cooperate with their friends. This is expressed mechanistically through excitation. If a neuron is friends with another neuron, it will excite the other neuron. Neurons compete with their foes. This is expressed mechanistically through inhibition. If a neuron is enemies with another neuron, it inhibits that other neuron.

Explain in your own words what is meant on page 7: "The bigger the guns, the fewer of them there can be. This may help explain why attention is limited."
Competition and cooperation are manifested in levels of control, that is, dominance hierarchies in the brain. The more dominant a neural ensemble is, the fewer ensembles of that rank there can be (e.g., six small cars fit into the same length of road as one big semi-truck). There is a bottleneck, a limited capacity.

Why does the book talk about few cars being able to get very close to a toll collector?
This is the idea of limited capacity in consciousness. There can only be about 4-9 thoughts close to consciousness at any one time.

The book includes discussion of someone jumping over a chasm and leads to the statement, "You're many beings." Explain this. Do you agree?
There is no single, powerful dictator in your entire brain. When you act, you act on behalf of an untold number of beings within you, who function in ways that may or may not happen to ensure their own survival. You can view yourself as a population made up of beings, none as intelligent as you, but collectively comprising you with no one in charge.

Charles Darwin lived from 1809-1882. Who was he?
Charles Darwin was one of the most important thinkers in the history of Western Civilization. He was the (most famous) originator of the idea of natural selection.

What is "natural selection" in your own words?
Survival of the fittest. Species that produce offspring tend to survive.

Replication is important in Darwin's theory. Why? How does the story of just one plant and just one animal bear on this?
To keep things going, you need multiplicity, or replication. Having many plants and animals boosts the chances that life continues. With only one plant and one animal, if either one is destroyed, life ceases to exist.

Variation is also important in Darwin's theory. Why?
Variation is diversity, and diversity allows species to be prepared for what may happen.

Selection is also important in Darwin's theory. Why?
Selection is important because it provides the means of choosing members of a species that have what it takes over those who don't.

The chapter had a section on sex. Why?
Sex is important because it produces offspring. It spreads genes through replication.

What principle is illustrated by Ellis-van Creveld syndrome and imprinting?
The founder effect: the tendency of initial, successful occupants of a niche to have an exceptionally strong effect on succeeding generations. The effect holds when the rate of interbreeding among first settlers and their seed exceeds the rate of breeding with newcomers.

One might think that if you haven't ridden a bicycle for many years, you'd forget how to do so. Why don't you forget how to do so, according to the book?
When a niche opportunity arises, a species occupies a new habitat and survives within it with a low population density for a long time. If conditions are hospitable, the species grows. Neural ensembles that support long unpracticed skills (e.g., riding a bike) can "bide their time" if conditions for their survival are not too unfavorable. That is, as long as they aren't crowded out by other competing elements, they will remain viable.

Do you really use only 1/8 of your brain? What do hydrothermal vent worms tell us about this?
No. As long as healthy neurons are present in an area of the brain, neurons studied there have been shown to be active. No potentially habitable niche goes unoccupied—even in places where you least expect it, like worms living on the hydrothermal vents unfathomably deep in the sea.

What is punctuated equilibrium and how does it relate to the mind?
Punctuated equilibrium is a relatively sudden change in the rate of evolutionary change. There are similar surges in mental development. For example, toddlers 18-24 months in age roughly double their vocabulary. Minds jump from state to state, from not understanding to understanding (e.g., from not recognizing a shape to recognizing it).

Who is the Boss in this chapter? Why is the Boss discussed vis à vis the understanding of the mind and brain?
The Boss in this chapter is God. Natural selection is to God what competition and cooperation in the jungle are to the mental executive. Just as you don't need a divine guiding figure who designs, creates, and kills off species to explain how species originate according to Darwin, you don't need a central executive to explain how thoughts arise or die, how behaviors are chosen or suppressed, or how motives arise and subside. {A side note from your instructor here. Please be aware that the foregoing has the phrase "according to Darwin."}

Provide definitions for these words and identify six more words from the readings that you were unfamiliar with or think others might be unfamiliar with, and provide definitions for them.
Vilified: criticized or condemned; spoken or written about in an abusively disparaging manner.

Hindsight bias: the difficulty of remembering what it's like to not know something you once didn't know, but do now. Also known as the knew-it-all-along effect.

Theistic: characterized by belief in the existence of a god or gods.

Eschews: deliberately avoids using; shuns, renounces, refrains from.

In your own words, what is meant by the two parts of the statement "The brain is locally global and globally local?"
On a local scale, the brain is made up of essentially the same things (neurons) which do the same things (metabolize, and also receive and send signals). On a larger scale, the various parts of the brain do different things.

Neurons have three parts. What are they and what do they do?
Dendrites receive signals from input neurons. The soma (cell body) integrates signals. The axon sends an electrical signal to the axon terminal for inter-neuronal communication.

In a brief statement in the book, neural pruning is likened to natural selection. Expand on this a bit and if you're really ambitious, do so by following up on Note 3 for this chapter.
Neural pruning amounts to neurons flushing out unnecessary connections. It mainly occurs early in life, concurrent with the formation of myelin. Pruning is similar to natural selection. As in Darwin's theory, natural selection favors organisms that adapt well in an environment. Likewise, neural pruning favors essential neural connections.

Describe in your own words what distinguishes excitatory and inhibitory effects of one neuron on another. How do these effects relate to neural competition and cooperation?
Neurons connected through synapses communicate with each other. At a synapse, neurotransmitters released from the pre-synaptic axon terminal in response to an electrical signal diffuse to the post-synaptic zone and bind to receptors. Depending on the receptor types, different neurotransmitters promote electrical excitation or inhibition of post-synaptic neurons. If multiple excitatory inputs converge on a single neuron, their effects add up, amplifying the "ON" signal of the target neuron. In contrast, if excitatory and inhibitory inputs reach the target neuron at the same time, they compete with each other, making it harder for the neuron to fire.

The book refers to a catchy phrase that neuroscientists use. What is it? Explain it in your own words.
"Neurons that fire together, wire together." If neurons fire synchronously with high probability, the strength of their connections tends to increase.

What helps make a neuron's chance of survival good or bad? (See the hamburger example, but make up some other scenario.)
The chance of survival for a neuron depends on its "teaming-up" with other neurons in at least one circumstance. Another example might be a neuron involved in sniffing when odors arise that are important to survival.

The Bell-Magendie Law pertains to two neural niches. What does the Law say? What does it mean? What are the two neural niches? Describe them in technical terms but also in terms of everyday experience. What are interneurons, and does their existence invalidate the Bell-Magendie Law?
The Bell-Magendie law says that fibers on the dorsal side of the spinal cord send afferent sensory signals to the brain, whereas fibers on the ventral side of the spinal cord send efferent motor signals to the body. This principle articulates the clear distinction between neural niches for sensation (input to the brain) and response (output from the brain to muscles). For example, when your finger is stung by a bee, that "pain" is recognized by your brain through the dorsal spinal cord, and you shake your hand via the signal from the brain, through the ventral spinal cord. Interneurons are neurons that transfer signals from sensory to central neurons and/or from central neurons to motor neurons. Their existence does not invalidate the Bell-Magendie Law because interneurons support the division between receptors and effectors.

Experiments by Gazzaniga and Sperry are described in the book. Describe the experiments in your own words. If you need to look up "haptic," go ahead and define it in your answer.
In the Gazzaniga and Sperry experiment, the corpus callosum was cut to isolate the right and left hemispheres of the brain in patients with severe epilepsy. Through the experiment, it was found that the left cerebral cortex is used more for language and the right cerebral cortex is used more for intuitive thinking. The word "haptic" refers to touch.

Why did the book talk about an interrupted sewer line? What was the point? Express it in your own words.
The point of the sewer story was to demonstrate that even a simple distraction can disrupt brain activity, but not because the sewer line is needed for artistic expression. We can learn which part of the brain is responsible for a particular function by studying brain damage, but adding new tasks, which leads to distraction or task changes, may not be informative.

Feature detectors exemplify localization. Why? Building on the discussion of feature detectors from the work of Hubel and Wiesel, suggest possible simple, complex, and hyper-complex feature detectors for sounds relevant to music. You needn't look this up. Invent some possibilities on your own, considering aspects of sounds that would be picked up by simple cells, complex cells, and hyper-complex cells.
Feature detectors are neurons that are inclined to detect certain aspects (features) of stimuli. Simple cells might detect elementary aspects such as specific musical sounds, complex cells might detect chords, and hyper-complex cells might detect a whole chord passage in a song.

What are grandmother cells? What point was being made about them?
Grandmother cells are hypothetical cells that can explain the ability to recognize complex patterns in a wide range of circumstances such as your grandmother's face in a huge range of poses. Yet the grandmother-cell concept has been criticized on the grounds that high-level detectors cannot account for recognition of the same object (e.g., your grandmother) over the infinite number of ways she can appear.

Why did Michael Merzenich cut off the middle finger of a monkey? What did he find? Say this in your own words, but please be sure to also include the word "plasticity" in your answer.
Merzenich amputated a monkey's middle finger to see how the monkey's somatosensory cortex would be affected by the loss of the middle finger. He found that after amputation, the middle finger area of the somatosensory cortex started to become responsive to touch on adjacent fingers. This and other demonstrations helped establish the principle of neural plasticity, according to which the functional properties of the nervous system can be reshaped. They are malleable, pliable, or "plastic."

Why did Michael Merzenich have a monkey learn to make fine tactile discriminations with one finger? What did he find (in your own words)? Was the conclusion consistent with or inconsistent with what he found by cutting off a finger?
He wanted to see if neural plasticity would work in other ways besides amputation. He discovered that practice results in enlargements of neural regions serving the practiced task. This conclusion was consistent with what he found by amputating the monkey's middle finger because it showed that after extensive practice, more brain area in the sensory cortex became responsive to touch, much as other areas that continued to receive signals (somatosensory regions for touch of the index finger and ring finger) grew, while areas that did not continue to receive signals (the somatosensory region for touch of the middle finger) shrank. It was not that more neurons suddenly appeared in the more signaled areas. Instead, it was that their responsiveness changed. Neural plasticity is demonstrated both in "bad" circumstances (losing a finger) and in "good" circumstances (practicing touch with particular fingers).

What surprising neural change leads to blind people developing heightened tactile sensitivity as a result of reading Braille?
Areas of their brains that would be responsive to vision become sensitive to touch.

Why did a man who had lost his arm feel his now-absent hand being touched when his face was palpated? (Look up palpated if you're not familiar with that term.)
V. Ramachandran reasoned that because the face and the hand regions of the somatosensory cortex are adjacent, sensory inputs from the man's face might form stronger connections with the area of the somatosensory cortex that used to get touch inputs from the (now-absent) arm. Thus, if touching the face activated that hand region, signals from that region to higher centers would still be interpreted as hand touches.

What is synesthesia and how does it relate to neural plasticity?
Synesthesia is the condition in which one type of stimulus arouses a sensation of another type. For example, some people who experience synesthesia have vivid associations between sights and sounds. When they see something, they also hear something even if the object does not make any sound. Synesthesia stems from neural plasticity, the brain's ability to change, or reorganize itself, by forming new connections between neurons, including neurons that typically respond only to one sensory modality. In synesthesia, those neurons respond to other sensory modalities. So, for example, neurons that respond to sights also respond to sounds. Transmission of signals from these "double-input" neurons to higher centers gives rise to perception of both sight and sound.

What is the link between neural plasticity and the jungle principle?
Neural plasticity refers to the brain's ability to dynamically reorganize itself, much as the life forms in a jungle evolve over time based on the dynamically changing challenges they face.

Define these terms: (a) Aphorism; (b) Raison d'être; (c) Tenuous; (d) Malleability; (e) Somatosensory cortex; (f) Droller; (g) Haptic; (h) Plasticity; (i) Palpate.
(a) a pithy observation that contains a general truth
(b) the most important reason (or purpose) for something's existence
(c) very weak
(d) a substance's ability to be shaped into something without breaking
(e) a brain area that processes touch inputs
(f) more unusual, more amusing
(g) relating to the sense of touch
(h) adaptability or ability to reorganize or remold
(i) examine something by touching it

In your own words, what is the problem with postulating a "head honcho" in the brain?
Invoking a "head honcho" in the brain essentially means that we rely on an intelligence or a mind to decide how to manage mental processes. This raises the question of how the "head honcho" makes decisions, and quickly leads to an infinite regress problem. We need to be able to explain mental processes in ways that do not require neurons to "know" what they are doing.

The chapter opens with a discussion of the problem one may have trying to filter out unwanted input. What was the main theoretical point here? Describe another example of difficulty filtering out unwanted input from your own experience.
If we are able to filter out unwanted input, then how do we make sure we don't inadvertently filter out input that we actually do want? It is important to be able to distinguish between relevant and irrelevant information, but how do we do that without fully processing each item?

You may have difficulty listening fully to lectures when there are construction noises going on outside. The warning "dings" from trucks when backing up are especially distracting.

In your own words, describe Broadbent's filter theory. Then explain how the "Dear Aunt Sally" experiment worked and why its result disproved Broadbent's theory.
Broadbent argued that we direct auditory attention to just one ear at a time, ignoring input to the other ear. This was disproved with the "Dear Aunt Sally" experiment. This experiment involved presenting two lists of words, one to each ear, at the same time. A single, meaningful sentence alternated between the ears, and the remaining positions were filled with numbers whose order wasn't special or meaningful. Because subjects could follow the meaning of the sentence, they were clearly not attending to one ear at a time.

The book commented on Aristotle, Newton, and Einstein. Why? What point was being made about Broadbent's theory and scientific theories more generally?
These well-known figures raised central issues in their fields and, at least in the case of Aristotle and Newton, their specific proposals were proven wrong. Still, their proposals were studied for years and years, and that study led to the development of better theories, so these thinkers are respected. Being wrong isn't necessarily a bad thing.

What is auditory shadowing and what finding provides evidence for a relevant dominance hierarchy?
Auditory shadowing is when one person immediately repeats words spoken by another while they are still coming in. When people shadow a speaker, they focus their attention on that speaker and are generally unable to report what anyone else is saying, unless the other input is highly important personally, like one's name. The fact that you can hear your name even when it's in an ostensibly unattended auditory channel shows that the neural representation of your name is high in the dominance hierarchy.

Paraphrase William James' definition of attention.

Attention is necessary because we are unable to attend to every stimulus in the world. We have to focus on one thing at a time. This involves enhancing the processing for the focused item and the inhibition of processing for everything else.

What are the Rubin figure and Necker cube? What aspects of them make them relevant to the study of attention and what lesson about attention is learned from them?

The Rubin figure and the Necker cube are images, each of which can be perceived in just one of two different ways at a time. Oddly, it is very hard to intentionally see each image in a particular way without having the image's interpretation switch, as if of its own accord. Because the image itself doesn't change (it's printed on a sheet of paper or on a screen), the changes we perceive must be the result of internal processes.

Review Pashler's experiment and finding and say what that finding suggests about attention.
Subjects were presented with lights and tones. When they saw a light on the left, they pressed a button with the left hand; when they saw a light on the right, they pressed a button with the right hand. When they heard a tone, they were to press a pedal with their foot. When a light and a tone came very close after one another, subjects were delayed in making the pedal press. This suggests that attention is a bottleneck of processing that is not tied to each body part, but limits the rate of response-processing itself.

Brief mention was made of Yarbus and his work on eye movements. Use Google Image to see some results of Yarbus' work and describe what you see via that search, emphasizing the substantive inference that can be drawn.

Yarbus showed that people look at images in particular ways that reflect the kinds of information they want to get from what they see. When people are instructed to remember people's identities, they look at faces, but they look at clothes or other parts of an image when they are told to make a judgment about something else (like how wealthy the family is). This indicates that attentional selection is driven in part by internal goals and motivations, not simply by the attributes of the world around us.

What is the orienting reflex, who are the "kings" and "queens" of the relevant part of the neural jungle, and why?

The orienting reflex is the tendency to look at certain stimuli in the world around us. Things that stand out from their surroundings by being different or being unexpected will demand attention. Neurons in your brain that are sensitive to these things are the most likely to respond and will dominate where attention is directed.

Posner asked participants to keep their eyes fixed on a particular location. He then cued their attention to a different location on the screen. When the cue was correct (and the target actually appeared in that location), subjects responded more quickly. But when the cue was incorrect, they responded more slowly. This shows that they were successfully directing their attention to the cued location even though their eyes hadn't moved there.

Four reasons are given in the book for positing inhibition as being important in attention. What are those four reasons, expressed in your own words?


1) If there were only excitation, brain activation would quickly build up and overload.

2) Inhibition allows the brain to rapidly re-set from one stimulus and get ready for the next item.

3) Inhibition is everywhere in the brain, so there's no reason to believe that attention is special and wouldn't include its effects.

4) There is direct evidence of inhibition in attention in that people inhibit themselves from returning their gaze to previously-viewed locations.

What is negative priming and why is it brought up in the book?

You have a tendency to be reluctant to look at a place where you previously learned not to look, even if later on you do need to look there. It is brought up to demonstrate how attentional processes have a prolonged effect on many aspects of our experience of the world.

Explain how the letters, SSSSS, TTSTT, and XXSXX on page 50 were used to provide evidence for inner conflict.

When asked to identify the central letter S, participants are slowest when that S is surrounded by TT's, which require a different response; they are quicker when that S is surrounded by XX's, which require no response; they are quickest when that S is surrounded by SS's, which require the same response (pressing the "I-see-an-S" button). This demonstrates how excitation (SSSSS) can speed responding and how inhibition (TTSTT) can slow responding relative to neutral conditions (XXSXX).

Response competition was discussed on pages 50-51. What is it and what is its empirical basis?
When you do a task with two choices, both responses are prepared at the same time but only one is acted upon. We know this is true because electrodes in muscles show that muscles in both hands are active in preparation for a response.

What are the Stroop task and Stroop effect?

In the Stroop task, participants see color words printed in different colors of ink. They are told to indicate the ink color of the word rather than the word itself (e.g., the word "yellow" printed in blue ink would have the answer "blue"). Sometimes the words are printed in the same color as their name ("blue" in blue ink) and other times the two don't match. When the colors are mismatched, participants are slower than when they are matched. This demonstrates interference between the two types of information.

An experiment was described concerning the face region of the brain and the house region of the brain. How did the experiment work and what did it show?

Participants were shown pictures of faces and pictures of things similar to faces while in an MRI scanner. By subtracting the signals between the two conditions, researchers highlighted areas of the brain that were uniquely activated by faces. A similar experiment found areas sensitive to houses. And when both were overlaid on top of one another, the activation matched what participants were told to attend to. This showed that attention changes how we process objects, not simply locations in space.

In cognitive psychology and neuroscience, what is a "production"?

A production is an if-then rule for what response to make if a stimulus is presented. Humans can't simply absorb information; we need to do something about it.
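
As a rough sketch, a set of productions can be written as condition-response pairs. The stimuli and responses below are hypothetical examples chosen for illustration, not a model from the book.

```python
# Hypothetical productions: each pairs a condition ("if") with a response ("then").
PRODUCTIONS = [
    (lambda stimulus: stimulus == "light on left", "press left button"),
    (lambda stimulus: stimulus == "light on right", "press right button"),
    (lambda stimulus: stimulus == "tone", "press foot pedal"),
]


def fire(stimulus):
    """Return the response of the first production whose condition matches."""
    for condition, response in PRODUCTIONS:
        if condition(stimulus):
            return response
    return None  # no production applies to this stimulus


response = fire("tone")
print(response)
```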

The homunculus was referred to as an irksome bloke. What do these terms mean? What is the (related) Russian doll problem in cognitive psychology and neuroscience?

One theory of attention is the spotlight theory, where attention moves around from object to object highlighting it for processing. However, how does the brain direct that spotlight? This recurring problem of "how to select" is essentially invoking a homunculus, and recurs so often in psychology that it can be irksome to develop theories without it. Of course, the homunculus would need a way to select items, resulting in minds-within-minds like a Russian doll, which has a doll inside a doll inside a doll, etc.

What do the terms early selection and late selection mean? Which model is preferred in the book and why is it allowed that attention can operate at the other end as well?

Early selection and late selection refer to when an item is selected in attention. In late selection, a lot of processing happens before we can identify an item and conclude that it should be ignored. In early selection, to-be-ignored items are filtered out early without being extensively processed. The book expresses a preference for late selection and the idea that items are selected immediately before actions are made. However, when it comes to simply attending to stimuli, early selection is reasonable, though according to the book's author, such early selection is made more plausible if one thinks that different actions are taken depending on what the early selection does.

What are the rules for playing 20 questions? Why is this game relevant to the Hick-Hyman Law? Describe a possible choice RT experiment involving the identification of sounds where you vary aspects of sounds as well as the number of them to see whether the time to identify a particular sound increases in a manner consistent with the hypothesis that they are, in effect, playing 20 questions to come up with the answer.

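In the game 20 questions, the player tries to identify an item, object, person, or subject by asking a series of yes/no questions aimed at narrowing down the correct answer. The game is relevant to the Hick-Hyman Law because each well-chosen yes/no question halves the set of remaining alternatives, so one more question is needed each time the number of alternatives doubles. A choice RT experiment using sounds could use the following design: participants respond differentially to four stimuli which vary with respect to volume (loud or soft) and pitch (high or low). Or there could be eight stimuli which vary with respect to volume (loud or soft), pitch (high or low), and timbre (for example, trumpet or violin). If listeners are in effect playing 20 questions, choice RT should increase by a constant amount from four stimuli (two yes/no questions) to eight stimuli (three yes/no questions).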

Donders' results are interesting because they depend on what stimuli and responses are possible. Explain.

Donders recorded reaction times (RTs) in tasks that called for different mental processes. In the simple task, a single light turns on and the participant has one button to push; the RT is about 200 milliseconds (ms). In the more complex task, either of two lights comes on and the participant must press the corresponding button; the RT increases by about 300 ms. Yet the longer choice RT is obtained with a stimulus and response identical to those in the simple case. So RT depends on which stimuli and responses are possible, not just on the ones that actually occur, and the same stimulus and response can be used to measure the mental processes that different tasks require.

What does the Hick-Hyman Law say? What other possible result, raised in the book, does it contradict?

The Hick-Hyman law says that choice RT increases by a constant amount with each doubling of the number of stimulus-response alternatives. This goes against the hypothesis that choice RT increases by a constant amount with each increment-by-one of the number of stimulus-response (S-R) alternatives. So choice RT doesn't go up by a constant amount as the number of S-R alternatives increases from 2 to 3 to 4 to 5, etc. Rather, choice RT goes up by a constant amount as the number of S-R alternatives increases from 2 to 4 to 8 to 16, etc.
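The contrast between "constant amount per doubling" and "constant amount per added alternative" can be sketched numerically. This is a minimal illustration assuming the common form RT = a + b * log2(N); the function name and the values of a and b are made up for the example, not taken from the book.

```python
import math

def hick_hyman_rt(n_alternatives, a=0.2, b=0.15):
    """Predicted choice RT in seconds, assuming RT = a + b * log2(N).

    a (base time) and b (time added per doubling) are illustrative values.
    """
    return a + b * math.log2(n_alternatives)

# Each doubling of the alternatives adds the same amount (b) to the RT.
for n in (2, 4, 8, 16):
    print(n, "alternatives:", round(hick_hyman_rt(n), 3), "s")

# Adding one alternative at a time adds less and less.
for n in (2, 3, 4, 5):
    step = hick_hyman_rt(n + 1) - hick_hyman_rt(n)
    print(n, "->", n + 1, "adds", round(step, 3), "s")
```

Under this assumed form, the step from 2 to 3 alternatives is larger than the step from 3 to 4, which is the pattern the competing "increment-by-one" hypothesis does not predict.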

In the digital age, people might be referred to with strings of 0's and 1's. Make up another example using dimensions other than the ones used in the book. Explain what "bits" are in your own words, and explain why computers can represent information of different kinds with bits?

Bits represent logical, binary values (true/false, +/-, or on/off). You can code an image with bits: at each pixel, is it black or white, for example? Because any piece of information can be broken down into a series of such two-way choices, different types of information can all be written as strings of bits, giving a universal language for all information.

Considering the average number of words that a typical college student knows, calculate the rate at which all those words are checked in a lexical decision task if it takes 1.2 seconds to confirm that a letter string shown on a computer screen is a word. Express the rate in terms of number of words checked per millisecond (thousandth of a second). Assume that the time to see the letter string plus the time to press the response button is .2 seconds. The remaining 1 second is taken up with the inner search.

If the average college student knows 40,000 words and all of them are checked on each trial, the inner search takes 1.2 seconds - 0.2 seconds = 1 second, or 1,000 ms. The rate is therefore 40,000 words / 1,000 ms = 40 words/ms for the search itself, not including the visual input and the button press.
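The arithmetic can be checked with a few lines of code. This is only a sketch; the variable names are made up, and the 40,000-word vocabulary is the figure assumed in the answer above.

```python
vocabulary_size = 40_000            # words known, as assumed in the answer
total_rt_ms = 1_200                 # 1.2 s to confirm the string is a word
perception_and_response_ms = 200    # 0.2 s to see the string and press the button

# Time left over for the inner search through the lexicon.
search_time_ms = total_rt_ms - perception_and_response_ms

words_per_ms = vocabulary_size / search_time_ms
ms_per_word = search_time_ms / vocabulary_size

print("search time:", search_time_ms, "ms")
print("rate:", words_per_ms, "words per ms")
print("time per word:", ms_per_word, "ms")
```

The time per word (a small fraction of a millisecond) is the number worth comparing with per-item scan times measured for short memorized lists.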

Why were psychologists jubilant in the 1950's? But what made them frustrated when chunks came along? What are chunks versus bits?

Psychologists were jubilant in the 1950's when information theory was developed because it offered the bit as a unit of measurement that psychologists could use, helping to make psychology more of a traditional "hard" science. The introduction of chunks undermined this idea, however, because it demonstrated that the amount of information people can store for immediate recall is based not just on the number of bits but on the number of meaningful units, or chunks. For example, phone numbers are more than 7 digits, including the area code, yet if you chunk the digits into smaller groups like 555, 0123, 7735, you can recall 11 digits with 3 chunks.

What does the magical number 7 refer to? Why did George Miller use the term "magical"?

Seven refers to the average number of meaningful clusters of information (chunks) people can store and immediately recall. "Plus or minus 2" refers to the variation in this number among people. Some people may be able to recall 8 chunks (one more than the average of 7) or 6 chunks (one less than the average of 7), but just about everyone can recall between 5 and 9 chunks. Miller used the term "magical" because he didn't know why 7 was the average number of chunks that could be recalled.

Explain what Sylvan Kornblum showed vis a vis information theory. Make sure you summarize information theory's account of choice RTs and then explain what Kornblum discovered instead. Finally express in your own words - not just "it's a jungle in there" -- what Kornblum's discovery reveals.

Kornblum showed that choice RTs were better predicted by the history of the S-R choices than by the number of S-R choices. Choice RTs increased for those S-R choices that had not been recently presented. The longer an S-R alternative went untested, the longer the RT. Information theory suggested that RT would increase with the amount of information, that is, with the number of choices in the experiment, because more decisions would need to be made. Kornblum, instead, suggested that if a neural system corresponding to an S-R choice had been used recently, it would react more vigorously the next time that choice was presented. So neural systems can be primed to respond again after recent activation, but this stronger response, or shorter RT, can decay over time if the S-R choice doesn't appear for a while.

What is the task that Saul Sternberg developed? How was it different from tasks that had been used before and what was it designed to explore?

He asked participants to memorize short lists with varying numbers of items and then to try to recognize items from the list when those items or distractor items were presented, one at a time, later on. During the recognition task, participants pressed one button if they were shown one of the letters they had memorized and another button if they were shown a different letter (not in the memorized list). Traditional studies had focused on whether information could be memorized, not the rate at which it is searched after being memorized.

Review what is said in the book about the relation between the number of searchers in memory and serial versus parallel search.

In a serial search model, there would be one searcher working their way down the list of items. However, in a parallel search where all items are scanned simultaneously, there can be as many searchers as the number of items being searched.

Suppose you have 100 books on your bookshelf and think you might have left a lottery ticket worth $1 million in one of them. You look through the books, one after the other, hunting for the lottery ticket, figuring you'll stop once you find the ticket. If it takes 1 second to look through a book, what would be the expected time to find the ticket if it's in one of the books, and how long would it take you to conclude that the ticket's not in one of the books? What is this method of search called?

If the ticket is equally likely to be in any book, you would expect to find it about halfway through, after (100 + 1)/2 = 50.5 books on average, or about 50 seconds. To conclude that the ticket is not in any of the books, you must check all of them: 100 books x 1 second/book = 100 seconds, or about 1.7 minutes. This is called serial self-terminating search: checking one item at a time, in order, and stopping when the correct item is found.
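The two search times for the bookshelf example can be sketched as follows. The function name is made up, and the calculation assumes the ticket is equally likely to be in any of the books.

```python
def self_terminating_times(n_items, seconds_per_item=1.0):
    """Return (expected time if the target is present, time if it is absent).

    Assumes the target is equally likely to be at any position, so on
    average (n + 1) / 2 items are checked before it is found.
    """
    expected_if_present = seconds_per_item * (n_items + 1) / 2
    time_if_absent = seconds_per_item * n_items  # every item must be checked
    return expected_if_present, time_if_absent

present, absent = self_terminating_times(100)
print("expected time if the ticket is there:", present, "seconds")
print("time to conclude it is not there:", absent, "seconds")
```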

What is serial exhaustive search? How does it differ from the other kind of serial search described in the book, both in terms of procedure and predictions?

The serial exhaustive search method involves searching through a list of items, one at a time, to identify the correct one. Even if the correct item is found, the search continues until every item in the list has been checked. Only at the end is the response made as to whether the item was in the list. The search is exhaustive because every item is checked, even after the correct one is identified. By contrast, a serial self-terminating search stops as soon as the correct item is found. Exhaustive search predicts that RT depends purely on the size of the list, so both "yes" RTs and "no" RTs increase with the size of the memorized list at the same rate, whereas self-terminating search predicts that "yes" RTs increase more slowly than "no" RTs.
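The different predictions of the two serial models can be sketched in code. This is an illustration only: the function names and the 400 ms base time are assumptions, and the 40 ms per item is the scan time quoted elsewhere in these notes.

```python
def exhaustive_rt(set_size, base_ms=400, scan_ms=40):
    """Serial exhaustive search: every item is checked on every trial,
    so the prediction is the same for 'yes' and 'no' responses."""
    return base_ms + scan_ms * set_size

def self_terminating_rt(set_size, target_present, base_ms=400, scan_ms=40):
    """Serial self-terminating search: stop when the target is found.
    On 'yes' trials about (n + 1) / 2 items are checked on average."""
    checks = (set_size + 1) / 2 if target_present else set_size
    return base_ms + scan_ms * checks

for n in (1, 2, 4, 6):
    print(
        "set size", n,
        "| exhaustive yes/no:", exhaustive_rt(n),
        "| self-terminating yes:", self_terminating_rt(n, True),
        "no:", self_terminating_rt(n, False),
    )
```

In this sketch the exhaustive model gives equal slopes for "yes" and "no" trials, while the self-terminating model gives a "yes" slope half as steep as the "no" slope.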

Explain in your own words where 26.67 minutes came from. What point was being made?

This is calculated by multiplying the average number of words known by college students (40,000) by the time needed to check one memory item, as estimated by Sternberg (40 ms). The result, 1,600,000 ms, is about 26.67 minutes, which is how long it should take a student to decide whether a letter string is a real word if they used the serial exhaustive search method. The point is that the task doesn't take this long. Instead it takes around 1 second, so serial exhaustive search at that rate can't be how one's full mental lexicon (dictionary) is searched.
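The figure can be reproduced with a short calculation. The variable names are made up; the numbers are the ones given in the answer above.

```python
vocabulary_size = 40_000     # words in the mental lexicon
scan_ms_per_item = 40        # time to check one item, in milliseconds

total_ms = vocabulary_size * scan_ms_per_item   # 1,600,000 ms
total_minutes = total_ms / 1000 / 60            # ms -> seconds -> minutes

print(round(total_minutes, 2), "minutes")       # 26.67 minutes
```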

Did Occam shave? What's really the point about his razor? Explain the context in which it was discussed in the book, but suggest other examples where Occam's razor might be useful. Hint: Think about UFO's, ghosts, and such

Occam said that among competing hypotheses, the one with the fewest assumptions is most likely to be correct. In other words, the simplest explanation is best. The method we use to search our memories should, by this criterion, be the same regardless of whether the list being searched is a special list that has just been learned or the entire mental lexicon. Occam's razor can be applied to many situations. For example, instead of imagining that fanciful all-powerful demigods drag the sun across the sky, it is simpler to say that the earth rotates and creates the appearance of the sun moving.

In your own words, what is limited-capacity parallel search?

According to this model, parallel searches are carried out more slowly as more and more items are being looked for.
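One simple way to picture this is a fixed amount of processing capacity shared equally among all the items being checked at once. The sketch below is an assumption-laden illustration, not the book's model: the function names, the equal-sharing rule, and the numbers are all made up.

```python
def limited_capacity_parallel_time(n_items, work_per_item=40.0, capacity=1.0):
    """Time (ms) for the search to finish when a fixed capacity is split
    equally across all items, which are processed at the same time."""
    rate_per_item = capacity / n_items   # each item's share shrinks as n grows
    return work_per_item / rate_per_item

def unlimited_capacity_parallel_time(n_items, work_per_item=40.0, rate=1.0):
    """Comparison case: every item is processed at full rate, so the
    number of items does not matter."""
    return work_per_item / rate

for n in (1, 2, 4):
    print(
        n, "items | limited:", limited_capacity_parallel_time(n),
        "ms | unlimited:", unlimited_capacity_parallel_time(n), "ms",
    )
```

Under these assumptions the limited-capacity parallel search slows down in proportion to the number of items, which is why it can be hard to tell apart from a serial search on RT alone.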

In your own words, what are stimulus-response compatibility, natural mappings, and unnatural mappings?

Stimulus-response compatibility refers to the similarity between a stimulus and its response. High compatibility (natural mappings, strong connections) is illustrated by a left-side stimulus leading to a left-side response. Low compatibility (unnatural mappings, weak connections) is illustrated by a left-side stimulus leading to a right-side response.

What is the Simon effect and why does it reflect inner conflict?

The Simon effect is observed in an experimental paradigm where you must push a button on the left when presented with a specific letter, say X, and a button on the right when a Y appears. Data suggest that when the X appears on your left side the response is quicker than when the X appears on your right, and vice versa for the Y. This demonstrates inner conflict because a left-side stimulus activates a left-side response no matter what the actual stimulus is, be it X, Y, or something else, and that activation competes with the required response when the two are on opposite sides.

What is semantic priming? Suggest a possible example other than the one given in the book. Why is semantic priming a form of positive priming?

Semantic priming refers to the observation that a response to a stimulus (e.g., school) is faster when it is preceded by a semantically related stimulus (e.g., teacher) compared to an unrelated one (e.g., jury). This is positive priming because it helps speed the processing of the input, reducing RT.

The book has a fairly long section entitled "Why Are RTs So Long?" Without going through all the details of the argument, what are the three main points? The first concerns the problem. The second concerns the possible solution (hint: long strings). The third concerns something else.

The problem is why reaction times are so long after the stimulus is presented, and why simple RTs differ from choice RTs: the time required does not match the speed of nerve conduction. Possible explanations are that RTs include non-neuronal factors, such as muscle contraction, or a lot of processing within the brain before the response. Both have been measured, however, and they do not account for the observed RTs. When inhibition as well as excitation within neural systems is considered, the length of RTs can perhaps be explained. The time needed for any given response to emerge depends on its own activation, but also on inhibiting its competitors and overcoming their inhibition of it.