Approaching acculturation through measures of personal network composition and structure

Chris McCarty, University of Florida

Jose Luis Molina and Miranda Lubbers, Universitat Autonoma de Barcelona

National Science Foundation, Award No. BCS-0417429

This presentation has five parts

1. An overview of social science research

2. An overview of social networks

3. An introduction to sociocentric network analysis

4. An introduction to personal network analysis

5. A presentation of Egonet

• Most social science research is designed to predict people's attitudes, behaviors, or conditions from their characteristics.

• Social scientists ask about or observe characteristics of respondents and use the variability in those characteristics to explain the dependent variables.

Example of a research design

A social scientist may collect information on a sample of 500 respondents and try to predict their smoking behavior using the variability in their age, education, income, height, and weight.

Independent variables: Age, Education, Income, Height, Weight

Dependent variable: Number of cigarettes smoked per day

Conclusion

The researcher concludes that age, education, and income are good predictors of how many cigarettes a respondent smokes per day, while weight and height are not.

Good predictors: Age, Education, Income

Poor predictors: Weight, Height

Dependent variable: Number of cigarettes smoked per day


Social influence

• Social scientists believe that some dependent variables are shaped by social factors.

• For example, it is commonly accepted that adolescents take up smoking because of peer influence.

• Because peer influence cannot be observed directly, social scientists design questions that serve as proxies for it.

Proxy questions

• Do your parents smoke? (Parents)

• Do most of your friends smoke? (Friends)

• Has a friend ever offered you a cigarette? (Offer)

Predictive power of social influence

Researchers have found that these proxy questions explain part of the variance in smoking that age, education, and income leave unexplained.

Independent variables: Age, Education, Income, Parents, Friends, Offer

Dependent variable: Number of cigarettes smoked per day


Questions

• Does knowing more detail about the social influence surrounding a respondent give us more explanatory power?

• What questions can we ask to obtain this kind of detail?

• We propose using the social network approach.

Two types of social network analysis

Sociocentric analysis ("complete" networks)

• Focuses on interaction within a group

• Collects information from the members of a group about their relations with the other members

Personal network analysis

• Focuses on the effects of the network on individual attitudes, behaviors, and conditions

• Collects information from the respondent (ego) about their interactions with the members of their network (alters)

A sociocentric approach to smoking and social influence

• Select a group of students within a class.

• Ask each student to rate, on a scale of 0 to 5, how much they socialize with each of the others.

• Ask the students whether or not they smoke.

Adjacency matrix of students

• The ratings given by each person can be used to build a matrix that represents the relations among the members of the class

• The cell where two people intersect holds the rating given by the row person to the column person

• David says he socializes with Faith at level 2

• Faith says she socializes with David at level 1

         David  Faith  Rosanna  Antonio  Napp  Lem  Jim  Beth  Mark  Kent  Amber  Thomas
David        5      2        2        0     0    1    0     3     1     0      2       0
Faith        1      5        5        0     0    0    0     1     0     0      2       0
Rosanna      2      5        5        0     0    1    0     2     0     0      4       0
Antonio      0      1        1        5     0    0    0     0     0     0      0       0
Napp         0      0        0        0     5    0    0     0     0     0      0       0
Lem          2      0        2        0     0    5    5     2     0     0      2       0
Jim          0      0        1        0     0    5    5     5     0     0      2       0
Beth         4      3        1        0     0    1    5     5     0     0      3       0
Mark         1      0        0        1     0    0    0     0     5     0      1       0
Kent         0      0        0        0     0    0    0     0     0     5      0       3
Amber        2      3        3        0     0    1    2     2     1     0      5       0
Thomas       0      0        0        0     0    0    0     0     0     3      0       5

Visualizing the network

• We can use the matrix to visualize the structure of relations

• There is one large group in the center.

• Amber and Beth smoke

• Napp does not socialize with anyone, and Thomas and Kent socialize only with each other

Visualizing the network

• We can calculate some measures of this structure

• There are two components in the network (three if Napp, an isolate, is counted as a component of its own)

• Beth has the highest degree centrality

• Amber has the highest betweenness centrality
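
As a rough illustration of how such measures can be derived from the class matrix, the sketch below computes a weighted out-degree (the sum of the ratings a student gives to others) and the connected components. Both choices are assumptions: the slides do not say which definition of degree was used or how ratings were turned into ties. Here any nonzero rating in either direction counts as a tie, and an isolate counts as a component of its own. Betweenness is left out to keep the sketch short.

```python
from collections import deque

NAMES = ["David", "Faith", "Rosanna", "Antonio", "Napp", "Lem",
         "Jim", "Beth", "Mark", "Kent", "Amber", "Thomas"]

# Rows are raters, columns are the classmates being rated (0 to 5).
RATINGS = [
    [5, 2, 2, 0, 0, 1, 0, 3, 1, 0, 2, 0],  # David
    [1, 5, 5, 0, 0, 0, 0, 1, 0, 0, 2, 0],  # Faith
    [2, 5, 5, 0, 0, 1, 0, 2, 0, 0, 4, 0],  # Rosanna
    [0, 1, 1, 5, 0, 0, 0, 0, 0, 0, 0, 0],  # Antonio
    [0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0],  # Napp
    [2, 0, 2, 0, 0, 5, 5, 2, 0, 0, 2, 0],  # Lem
    [0, 0, 1, 0, 0, 5, 5, 5, 0, 0, 2, 0],  # Jim
    [4, 3, 1, 0, 0, 1, 5, 5, 0, 0, 3, 0],  # Beth
    [1, 0, 0, 1, 0, 0, 0, 0, 5, 0, 1, 0],  # Mark
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 3],  # Kent
    [2, 3, 3, 0, 0, 1, 2, 2, 1, 0, 5, 0],  # Amber
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 5],  # Thomas
]


def out_strength(matrix):
    """Sum of the ratings each person gives to others (diagonal ignored)."""
    return [sum(v for j, v in enumerate(row) if j != i)
            for i, row in enumerate(matrix)]


def components(matrix):
    """Connected components; a nonzero rating in either direction is a tie."""
    n = len(matrix)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            i = queue.popleft()
            group.append(i)
            for j in range(n):
                tied = matrix[i][j] > 0 or matrix[j][i] > 0
                if j != i and j not in seen and tied:
                    seen.add(j)
                    queue.append(j)
        groups.append(sorted(group))
    return groups


strength = dict(zip(NAMES, out_strength(RATINGS)))
top = max(strength, key=strength.get)
groups = [[NAMES[i] for i in g] for g in components(RATINGS)]

print("Highest weighted out-degree:", top, strength[top])
for g in groups:
    print("Component:", g)
```

Under these assumptions Beth has the highest weighted out-degree, and the matrix splits into the large central group, the Kent and Thomas pair, and Napp alone. A different tie definition (for example, requiring a minimum rating) could change the ranking.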

Conclusion

• We can conclude that those who belong to Beth and Amber's group are more likely to smoke.

• Napp, Kent, and Thomas are not.

• This analysis says nothing about influences from outside the group.

• To study influences across groups we use personal network analysis.

Tom has a personal network of 10 people

Tom meets these people in three groups: FAMILY, WORK, and CLUB

Within each group everyone knows everyone else

There are also some ties between groups

Personal networks can sometimes be complex

Introduction to collecting personal network data

1. Identify a population

2. Select a sample of respondents

3. Ask questions about each respondent

4. Elicit the members of the personal network

5. Ask questions about each member of the personal network

6. Ask the respondent to evaluate the relation between members of the personal network

Identify a population

• Personal network analysis begins much like any other social science research.

• The first step is to clearly identify the population of interest.

Select a sample of respondents

• Collecting personal network data can involve a long interview that sometimes requires special software.

• This can mean a trade-off between the representativeness of the sample and the level of detail collected about each personal network.

Questions about ego

• We want to know some things about the respondent (ego)

– We want to know about the dependent variables of interest.

– We also want to know about other explanatory variables that are not related to social influence.

Elicit the members of the personal network

• This is where personal network data collection differs from other social science research.

• We ask ego a set of questions (name generators) that elicit the names of people they know (alters):

– A free list of people they have had contact with in the past year

– People with whom they discuss important matters

– People they talked to last week

– People with specific first names

• This defines the network

Questions about each alter

• Ask ego about each alter

• This is usually the longest part of the interview

• If each respondent lists 50 alters and we want to know 10 things about each one, we have to ask 500 questions.

• A balance must be struck between the number of alters and the amount of information collected about each one.

Ask ego to evaluate the relation between each pair of alters

• Finally, we want to collect structural data to build an adjacency matrix.

• This means ego must evaluate every possible tie between each pair of alters.

• Fortunately, we usually assume ties are symmetric, which means we only need to know whether the two alters are related, not the direction of the tie.

• The number of ties to evaluate grows quadratically as alters are added: n alters require n(n-1)/2 evaluations.
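
The burden of evaluating alter pairs can be made concrete with a short calculation. The sketch below assumes symmetric ties, so each unordered pair of alters is evaluated once; the function name is illustrative only.

```python
def alter_pair_evaluations(n_alters: int) -> int:
    """Number of unordered alter-alter pairs ego must evaluate."""
    return n_alters * (n_alters - 1) // 2


for n in (10, 20, 45, 50):
    print(f"{n} alters -> {alter_pair_evaluations(n)} pair evaluations")
```

With 45 alters this gives 990 evaluations, and with 20 alters it gives 190.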

Respondent burden as alters are added

Figure: Respondent burden by number of alters (x-axis: Alters, 1 to 49; y-axis: Alter pair evaluations, 0 to 1400)

Four possible solutions for reducing respondent burden

1. Ask for fewer alters

2. Ask for many alters, then draw a random sample of them and evaluate the ties only among the sampled alters

3. Ask for many alters, then evaluate only a subset of the ties

4. Try to predict a tie from the notion of transitivity
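
The second option, a random sample of alters, can be sketched as follows. The alter names, the fixed seed, and the helper names are placeholders for illustration; the sizes 45 and 20 are the ones used elsewhere in this presentation.

```python
import random
from itertools import combinations


def sample_alters(alters, k, seed=None):
    """Draw k alters at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(alters, k)


def pairs_to_evaluate(alters):
    """All unordered alter pairs ego would be asked about."""
    return list(combinations(alters, 2))


# Hypothetical free list of 45 alters.
all_alters = [f"alter_{i:02d}" for i in range(1, 46)]

subset = sample_alters(all_alters, 20, seed=1)
full_pairs = pairs_to_evaluate(all_alters)
sampled_pairs = pairs_to_evaluate(subset)

print(len(full_pairs), "pairs for the full list")
print(len(sampled_pairs), "pairs for the random subset")
```

Sampling 20 of the 45 alters cuts the pair evaluations from 990 to 190, roughly a fifth of the original burden.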

Conclusions

• For most structural measures, a random subset of 20 alters yields results similar to those from a list of 45 alters.

What kind of data do we get?

• Data on network composition. These are summaries of the attributes of network alters.

• Data on network structure. These are summary measures of the pattern of relations

• Combinations of composition and structure

Personal network composition

Name       Closeness  Relation  Sex  Age  Race  Where Live  Year_Met
Joydip_K           5        14    1   25     1           1      1994
Shikha_K           4        12    0   34     1           1      2001
Candice_A          5         2    0   24     3           2      1990
Brian_N            2         3    1   23     3           2      2001
Barbara_A          3         3    0   42     3           1      1991
Matthew_A          2         3    1   20     3           2      1991
Kavita_G           2         3    0   22     1           3      1991
Ketki_G            3         3    0   54     1           1      1991
Kiran_G            1         3    1   23     1           1      1991
Kristin_K          4         2    0   24     3           1      1986
Keith_K            2         3    1   26     3           1      1995
Gail_C             4         3    0   33     3           1      1992
Allison_C          3         3    0   19     3           1      1992
Vicki_K            1         3    0   34     3           1      2002
Neha_G             4         2    0   24     1           2      1990
...

This ego has told us some things about each alter. For example, Joydip is a 25-year-old male whom she met in 1994 and to whom she is very close.

And we can add these to our model

Independent variables: Age, Education, Income, Altage, Altsmoke, Duration

Dependent variable: Number of cigarettes smoked per day

For each respondent these now become variables about their social environment that can be used to predict outcome variables. In this case we may believe that higher proportions of smoking alters lead to smoking.

Now we can create a set of compositional variables

• Average age of alters (ALTAGE)

• Proportion of alters that are women (ALTWOMEN)

• Proportion of alters that are family (ALTFAMILY)

• Average length of time ego has known each alter (DURATION)

• Proportion of alters that smoke (ALTSMOKE)
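
A minimal sketch of how such compositional summaries might be computed, using only the fifteen alters visible in the composition table (the full list is truncated in the slides). It assumes that Sex is coded 1 for male and 0 for female, which is inferred from the description of Joydip as male. ALTFAMILY, DURATION, and ALTSMOKE are left out because the Relation codebook, the interview year, and the smoking column are not shown. CLOSENESS is an extra summary added here for illustration.

```python
from statistics import mean

# (name, closeness, relation, sex, age, race, where_live, year_met)
ALTERS = [
    ("Joydip_K", 5, 14, 1, 25, 1, 1, 1994),
    ("Shikha_K", 4, 12, 0, 34, 1, 1, 2001),
    ("Candice_A", 5, 2, 0, 24, 3, 2, 1990),
    ("Brian_N", 2, 3, 1, 23, 3, 2, 2001),
    ("Barbara_A", 3, 3, 0, 42, 3, 1, 1991),
    ("Matthew_A", 2, 3, 1, 20, 3, 2, 1991),
    ("Kavita_G", 2, 3, 0, 22, 1, 3, 1991),
    ("Ketki_G", 3, 3, 0, 54, 1, 1, 1991),
    ("Kiran_G", 1, 3, 1, 23, 1, 1, 1991),
    ("Kristin_K", 4, 2, 0, 24, 3, 1, 1986),
    ("Keith_K", 2, 3, 1, 26, 3, 1, 1995),
    ("Gail_C", 4, 3, 0, 33, 3, 1, 1992),
    ("Allison_C", 3, 3, 0, 19, 3, 1, 1992),
    ("Vicki_K", 1, 3, 0, 34, 3, 1, 2002),
    ("Neha_G", 4, 2, 0, 24, 1, 2, 1990),
]


def composition(alters):
    """Summarize alter attributes into ego-level compositional variables."""
    n = len(alters)
    return {
        "ALTAGE": mean(a[4] for a in alters),
        # Assumption: sex == 0 means female.
        "ALTWOMEN": sum(1 for a in alters if a[3] == 0) / n,
        "CLOSENESS": mean(a[1] for a in alters),
    }


summary = composition(ALTERS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Each respondent contributes one such row of summaries, which can then sit alongside ego's own attributes in a regression data set.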

Personal Network Structure

           Joydip_K  Shikha_K  Candice_A  Brian_N  Barbara_A  Matthew_A  Kavita_G  Ketki_G  ...
Joydip_K          1         1          1        1          0          0         0        0  ...
Shikha_K          1         1          0        0          0          0         0        0  ...
Candice_A         1         0          1        1          1          1         1        1  ...
Brian_N           1         0          1        1          1          1         1        1  ...
Barbara_A         0         0          1        1          1          1         0        0  ...
Matthew_A         0         0          1        1          1          1         1        1  ...
Kavita_G          0         0          1        1          0          1         1        1  ...
Ketki_G           0         0          1        1          0          1         1        1  ...
...

The same Ego also evaluated the ties between their alters. We end up with an adjacency matrix for each ego. We can use this to calculate structural measures.

Now we can create a set of structural variables

• Number of components (COMP)

• Average betweenness centrality (BETWEEN)

• Closeness centralization (CLOSCENT)

• Number of alters in network core (CORESIZE)

And these can be added to the model

Independent variables: Age, Education, Income, Altage, Altsmoke, Duration, Comp, Between, Coresize

Dependent variable: Number of cigarettes smoked per day

In this model we want to test whether the structure of the personal network impacts smoking. For example, betweenness centrality is a measure of bridging. Bridging represents exposure to different groups which may not tolerate smoking.

Some measures of personal network structure

• Degree Centrality – An alter is highly degree-central to the extent he or she is directly connected to many other alters.

• Closeness Centrality – An alter is highly close-central if he or she is connected by short paths to many other alters.

• Betweenness Centrality – An alter is highly between-central to the extent he or she lies on many geodesics (shortest paths) between alters.

• Components – A set of alters who are connected to one another directly or indirectly.

• Isolates – A node unconnected to any other node.

• Network-Degree Centralization – A measure of the extent to which the network is dominated by a single alter using degree centrality.

• Network-Closeness Centralization – A measure of the extent to which the network is dominated by a single alter using closeness centrality.

• Network-Betweenness Centralization – A measure of the extent to which the network is dominated by a single alter using betweenness centrality.
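
Some of these measures are simple enough to sketch directly. The example below uses only the 8-by-8 excerpt of the alter adjacency matrix shown earlier, so the numbers describe that excerpt rather than the full network. It computes degree, density (the share of possible ties that are present), components, isolates, and degree centralization using Freeman's normalization for undirected networks. The function names are illustrative, and this is not a description of how Egonet computes these values.

```python
NAMES = ["Joydip_K", "Shikha_K", "Candice_A", "Brian_N",
         "Barbara_A", "Matthew_A", "Kavita_G", "Ketki_G"]

# Symmetric 0/1 ties among the first eight alters (diagonal is 1 in the source).
ADJ = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1, 1, 1],
]


def degrees(adj):
    """Number of other alters each alter is tied to (diagonal ignored)."""
    return [sum(v for j, v in enumerate(row) if j != i)
            for i, row in enumerate(adj)]


def density(adj):
    """Ties present divided by ties possible among the alters."""
    n = len(adj)
    return sum(degrees(adj)) / (n * (n - 1))


def components(adj):
    """Sets of alters connected directly or indirectly (depth-first search)."""
    n = len(adj)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j != i and adj[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


def degree_centralization(adj):
    """Freeman degree centralization: 0 for equal degrees, 1 for a star."""
    d = degrees(adj)
    n = len(d)
    return sum(max(d) - x for x in d) / ((n - 1) * (n - 2))


deg = degrees(ADJ)
isolates = [NAMES[i] for i, d in enumerate(deg) if d == 0]

print(dict(zip(NAMES, deg)))
print("Density:", round(density(ADJ), 3))
print("Components:", len(components(ADJ)))
print("Isolates:", isolates)
print("Degree centralization:", round(degree_centralization(ADJ), 3))
```

In this excerpt Candice and Brian are the most degree-central alters, and all eight alters fall into a single component with no isolates.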

How do we collect and analyze these data?

• Many researchers develop paper instruments or computerized instruments that let them collect these data

• Compositional data are calculated using a statistical package (e.g. SAS or SPSS)

• Structural analyses are not typical and are often limited to personal network density, since it is an easy measure to program

Need for personal network software

• A standardized software package would offer many advantages

• It provides a computer interface that edits and standardizes data input with complex skip patterns

• It can automatically calculate compositional and structural measures and export them to a data set compatible with a statistical package

• It makes it possible to analyze individual cases

EgoNet

Personal Network Analysis Software

Available at www.mdlogix.com

Egonet design

• Egonet is written in Delphi and runs on a Windows platform

• There are two programs:

1. Administrator program to create a study and assemble a questionnaire

2. Client program to collect data and analyze it

Example data file from Egonet

ID  Sex  Age  Average tie strength  Proportion of females  Average alter age  Components  Cliques
1   M     35                   2.6                   0.23               34.6           3       13
2   F     56                   3.4                   0.65               45.3           1        7
3   F     24                   4.1                   0.52               24.8           2       23
4   M     41                   3.5                   0.34               43.2           1       12
5   F     22                   2.3                   0.72               21.5           1       14
6   F     65                   4.2                   0.58               62.7           2       19
7   M     19                     3                   0.29               18.4           1       11

Egonet outputs data across all the respondents and assembles it into one file. Notice that the data set has data about ego (sex, age), compositional data (Proportion of females, average alter age), and structural data (components, cliques). This data set would be difficult to produce without this software.

Egonet can also visualize the personal network of a single Ego

This is the personal network of Merced, a 19-year-old second generation West African migrant in Spain. The dots represent her alters and the lines represent a connection between alters based on her evaluation of the ties.

We can label the dots (nodes) with information we collected from Merced about each alter, like where they are from.

We can also size the nodes, in this case by Merced’s assessment of how close she is to each alter

And we can color the nodes, in this case by race

Finally, we can shape the nodes, in this case by whether they smoke (smokers are the squares)

We now have a picture we can use to interview Merced about her acculturation experience in Spain. See the potential influence of white, Spanish smokers from her high school in the upper right.

Contrast this with the visualization of her 22-year-old sister Laura, labeled, sized, colored, and shaped with the same variables. Their acculturation experiences are different.

This is Vivian, a 36-year-old Moroccan woman.

And this is Jose, a 46-year-old Dominican man.

We can also use Egonet to visualize structural measures. Here is Merced’s network with nodes colored by betweenness centrality.

Here Merced’s network is colored by her relation type (blue nodes are relatives). Egonet has done a cluster analysis and circled nodes and labeled them with numbers.

Samples of first and second generation immigrants

Location       Group               Interviews
Barcelona      Argentine                   81
Barcelona      Moroccan                    70
Barcelona      Dominican                   64
Barcelona      Gambian                     26
Barcelona      Equatorial Guinean           9
Barcelona      Senegalese                  43
New York City  Puerto Rican                86
New York City  Dominican                   97
New York City  Colombian                   34
Miami          Cuban                       12
Miami          Haitian                     13
Kansas         Mexican                     13
Total                                     548

Procedure

• Respondents answered a set of questions about themselves, including an acculturation scale

• Respondents free-listed 45 alters given the following definition:

“You know them and they know you by sight or by name. You have had some contact with them in the past two years, either face-to-face, by phone, mail or e-mail, and you could still contact them if you had to.”

Procedure (continued)

• Respondents answered twelve questions about each alter

• Respondents evaluated all 990 possible ties between alters rating the probability that the alters talk to each other independently of the respondent

• Structural variables were calculated using ties that the respondent was sure existed

• We conducted a qualitative interview with each respondent using a visualization of their network
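
The step of keeping only the ties the respondent was sure about can be sketched as a simple threshold on the rated matrix. The response codes and the four-alter example below are hypothetical; the slides do not give the actual rating scale or data.

```python
# Hypothetical response codes for "do these two alters talk to each other?"
NOT_LIKELY, MAYBE, SURE = 0, 1, 2


def binarize(ratings, threshold=SURE):
    """Keep a tie only where the rating reaches the threshold; zero the diagonal."""
    n = len(ratings)
    return [[1 if i != j and ratings[i][j] >= threshold else 0
             for j in range(n)]
            for i in range(n)]


# Made-up ratings among four alters, for illustration only.
ratings = [
    [0, 2, 1, 0],
    [2, 0, 2, 0],
    [1, 2, 0, 1],
    [0, 0, 1, 0],
]

adj = binarize(ratings)
for row in adj:
    print(row)
```

Lowering the threshold to MAYBE would add the uncertain ties, which generally makes networks denser and can merge components, so the choice of cutoff affects every structural variable downstream.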

Data Cleaning

• In some cases we questioned the authenticity of the data

• We viewed each of the 486 visualizations and listened to the interviews.

• Each case was rated on a scale of 0 to 5 reflecting our assessment of its authenticity

Figure: Percent distribution of authenticity scores (x-axis: authenticity score, from "Not at all authentic" to "Very authentic"; y-axis: percent, 0 to 50)

*There are 369 cases (so far) with authenticity of 4 or 5

Dependent Variables

• Health – 42% excellent, 42% good, 14% fair, 2% poor

• Smoking - 76% smoke, 24% don’t smoke

• Depression – 66% not depressed, 23% mild depression, 11% depressed

• Children – 51% no children, 19% one child, 13% two children

Independent variables: Respondent characteristics

Variable                        Proportion
HOST COUNTRY (Spain)                   .68
COUNTRY ORIGIN (Dominican)             .30
SEX (Male)                             .54
GENERATION (First)                     .92
AGE (Less than 30 years old)           .51
SKIN COLOR (White)                     .30
MARITAL STATUS (Never married)         .52
LEGAL (Yes)                            .65
EMPLOYMENT (No)                        .30
EDUCATION (Secondary school)           .49
ACCULTURATION (Level 1)                .40

Independent variables: Compositional characteristics

Variable                      Mean   SD
% strong ties                  .42  .23
% men                          .53  .17
% see every week or more       .47  .22
% living in host country       .43  .33
% born in host country         .19  .20
% family                       .36  .21
% above 50 years old           .14  .13
% that are White               .49  .32
% can talk to about problems   .41  .21
% never smoked                 .36  .27

Independent variables: Structural characteristics

Variable                           Mean    SD
Average degree centrality            31    19
Average closeness centrality        141   170
Average betweenness centrality      1.7    .9
Number of components               1.37   .87
Number of isolates                 2.06   4.6
Number of alters in network core  13.04  8.38

Correlation between Dependent Variables and Respondent Characteristics

Variable                        Health  Smoking  Depression  Children
HOST COUNTRY (Spain)              .001     .020        .015     -.069
COUNTRY ORIGIN (Dominican)        .094    -.143       -.044      .135
SEX (Male)                        .150    -.121        .027      .110
GENERATION (First)                .058    -.149       -.098      .171
AGE (Less than 30 years old)      .243    -.049        .032      .542
SKIN COLOR (White)               -.035     .072       -.087     -.067
MARITAL STATUS (Never married)   -.114     .132        .058     -.476
LEGAL (Yes)                       .050     .018       -.033     -.127
EMPLOYMENT (No)                  -.007     .044       -.090     -.104
EDUCATION (Secondary school)      .152     .001        .115      .258
ACCULTURATION (Level 1)          -.048     .027        .026     -.211

Correlation between Dependent Variables and Compositional Characteristics

Variable                      Health  Smoking  Depression  Children
% strong ties                  -.004    -.033       -.019      .097
% men                          -.077     .102       -.043     -.120
% see every week or more       -.076    -.007       -.101     -.102
% living in host country        .018     .041        .050     -.072
% born in host country         -.055     .062        .025     -.108
% family                        .149    -.110        .013      .261
% above 50 years old            .162    -.076       -.033      .353
% that are White               -.035     .072       -.087     -.067
% can talk to about problems   -.055     .174        .065     -.050
% never smoked                 -.053     .412        .104     -.242

Correlation between Dependent Variables and Structural Characteristics

Variable                          Health  Smoking  Depression  Children
Average degree centrality           .019    -.135       -.013      .190
Average closeness centrality       -.042     .030        .001     -.088
Average betweenness centrality      .068     .125        .040     -.081
Number of components               -.104     .066        .027     -.128
Number of isolates                 -.041    -.028        .006     -.087
Number of alters in network core    .015    -.089        .014      .226
