Joseph Nathan Cohen

Department of Sociology, CUNY Queens College, New York, NY

Laura Nelson: What Does History Remember?

Original Video Description

This workshop discusses Laura Nelson’s (Northeastern) work in development, which applies computational methods to historical records to get a sense of what gets remembered. Today’s discussion focuses on what is remembered from the history of the women’s rights movement.

Transcription (Auto-Generated)

All right, and okay. All right, I think we’re, uh, live now. Uh, all right, Charlie, do you want to, uh, introduce us, get us going?

Yeah. Hi everyone, I’m Charlie Gomez. Uh, my co-hosts here are Joe Cohen and Hongwei Zhu, and we have Laura Nelson from Northeastern University, and she will be presenting her, uh, work in progress, uh, entitled “And the Rest Is History.” And I was actually just commenting what a great title that is, for someone who can’t come up with titles to save themselves on any paper, so I, that is, uh, I really appreciate that. It’s a rare skill. Um, I’m gonna, so, uh, Laura is an assistant professor of sociology at Northeastern University, uh, where she’s part of the core faculty at the NULab for Texts, Maps, and Networks. She’s also affiliated at the Network Science Institute at Northeastern and is on the executive committee for Women’s, Gender, and Sexuality Studies. Uh, prior to Northeastern, she was a postdoctoral fellow at the Berkeley Institute for Data Science and Digital Humanities, um, and at the Management and Organizations department at Northwestern University, where she was also affiliated with NICO, the Northwestern Institute on Complex Systems. So she uses computational tools, primarily text analysis, to study social movements, culture, gender, institutions, and organizations. Her work can be found in Mobilization, Gender & Society, Sociological Methods & Research, Sociological Methodology, as well as the Oxford University Press, among other outlets, and she gives many talks and does many workshops on computational social science here and abroad. Um, and we are really excited to hear, uh, your new work, and, um, this work is sort of, um, it’s a work in progress, so we’re also eager to kind of workshop some of the ideas that you have around there. So everyone, you know, uh, we are a very friendly bunch, we like to kind of, uh, get our hands dirty, so, um, we were definitely excited to give Laura some great feedback on her work in progress.

Great, thank you. Thank you for
that introduction. Thank you for having me. I’m really excited about this. Let me share, oh, I don’t have permissions. Oh, I do have permissions to share my screen. I’ll be there in about 15 minutes, sweetie. Um, all right, so I want to talk today about history, unsurprisingly.

Rosa Parks

Uh, so most of you probably recognize this picture and this person. This is, of course, Rosa Parks, who is well known for kind of kick-starting the modern civil rights movement when she refused to give up her seat on the bus to a white man, um, and the Montgomery bus boycott followed, which was the initial event that they commonly say started the civil rights movement. So this picture, of course, is not a real picture of the event. It’s a staged picture: it was staged and taken after the event in order to commemorate the event, which was a common occurrence during this era. There’s an interesting story of that white guy sitting behind her, which is fun to look up if you’re interested. But this is a picture that I think a lot of people would recognize, and perhaps a lot of people would think that this was the actual event, but it’s not, it’s a staged event. And this motivates kind of my interest in history and how we remember history and how we think about history. Um, I think it’s a good example of kind of how history is made and, um, remembered by people.

So most of you probably recognize Rosa Parks. Some of you may recognize this woman, but maybe not all of you. This is Recy Taylor. So Rosa Parks actually got her start in civil rights organizing in the anti-sexual harassment movement, so she did a lot of work on the harassment and rape cases of black women in the South, and Recy Taylor was one of her big early cases. She was sexually assaulted pretty gruesomely by a group of white men, and Rosa Parks took up her case, and that got her really well known in the area for her work there. There’s books written about her now, books about Recy Taylor now, but she’s not as well
known as Rosa Parks in the role in the civil rights movement. Um, and most of you probably don’t recognize that third picture, Claudette Colvin. So she was a 15-year-old girl and did the same thing Rosa Parks did a few years earlier: she refused to give up her seat on the bus. And of course we don’t commemorate her the way we do Rosa Parks, and there was hundreds, if not thousands, of other people who did the exact same act that Rosa Parks did, got arrested for it, and haven’t been kind of enshrined in our historical memory the way that Rosa Parks had. So this is generally how history works. Some events are simply more important to history. Some events, people, and ideas get enshrined in our history books and some don’t. Some are kind of lost to the historical archives, or the dustbin of history, if we want to go Marxist here.

What gets written into history books

So my question today, and this kind of motivating question for this new project I’m working on, is what gets written into the history books. And as Charles was saying, this is very preliminary. I’ve been wanting to do this project for years, so I’m excited that I have the time and opportunity to do it now. So I’m basically going to present a bunch of descriptive stuff, descriptive statistics that I, um, have and some descriptive text stuff, and I’m hoping to really just brainstorm with you guys about the next steps: what’s interesting to you and what’s not, what I could possibly do going forward to kind of shape this into a more systematic, not systematic, but into a project that makes sense for sociology, the sociology of knowledge, and our understanding of women’s movements over time and how they get remembered. So that’s where I am, and I’m very happy to be interrupted if something sticks out to you. That’s really what I want to get here, is if you see something and you’re like, wow, that’s fascinating, I want to know that, because that’s where, that’s kind of
what will shape how I go forward. Me and, um, I’m working with two graduate students on this, so they’re very eager to hear what you think of this here.

How we get history

All righty, so the way we get history today is probably not through history books. If we’re thinking outside of the school format, it’s probably Google. So if you’re interested in learning the history, or you’re reminding yourself about a historical thing that you want to put in a paper, you probably put it into Google or another search engine, and more recently what we’ve been seeing is Wikipedia tends to be the first hit. So Wikipedia is the first hit here, and it gives us some key facts: Rosa Parks was in the civil rights movement, she’s known for the Montgomery bus boycott, her occupation as a civil rights activist, etc. And even more so, and this is, I don’t know, past maybe a year or two, Google has been actually putting like a pull-out box on the right side of your search screen that pulls information from Wikipedia and kind of puts it there as the authoritative source on the subject that you’re searching for. This doesn’t work for everything you search for, but a lot of time that’s what you get. And not only does it just kind of pull out the images and the screenshot of it, you know, you start to see some things: oh, she’s the first lady of civil rights, the mother of the freedom movement. So if you’re searching for history, this is what you see right away on Google: you see Wikipedia, and then you see this pull-out of Wikipedia with all of these details on it.

What is Wikipedia doing

So not surprisingly, I am very interested in Wikipedia. What is Wikipedia doing? How is it conveying these historical movements? If people go to Wikipedia to find information about history, what are they seeing, and does it match up with the way history was actually created? Does it match up with what the people who were involved in creating those movements and moments were thinking is important, what they were fighting for? How does that
translation process happen? And, you know, the bigger questions are how does that impact the way we understand history, think about history, etc. So there’s a lot of really interesting questions, I think, that we can dig into with Wikipedia and Wikipedia data. They’re great, they’re open with their data, they release it all, so it’s a really fantastic resource for a lot of computational social science work, um, coming out recently and for about 10 years now. And the other thing is...

I have a clarifying question. So, um, I would imagine Rosa Parks obviously probably has a very extensive, um, and, uh, very well cited, uh, Wikipedia entry. I’m also wondering about the other two individuals, right, so these other figures who were involved, um, and sort of like, you know, in Montgomery, um, you know, because like obviously, you know, Rosa Parks was already sort of a semi-prominent figure, but these other figures who came before her, um, I’m assuming that they don’t necessarily have a Wikipedia page.

Yeah, Claudette Colvin does not. Recy Taylor might now at this point, because that book was written on her recently. Um, but that’s exactly what I’m interested in. That is precisely the question I’m interested in. Yes, Rosa Parks is an obvious one, right, she’s going to have her own Wikipedia page. Claudette Colvin doesn’t, Recy Taylor might, and that’s precisely what I want to measure: what ideas and organizations and people get their own Wikipedia page, who doesn’t, and what that means. So that’s precisely why I’m going to this Wikipedia page. And, you know, the problem with a lot of social movement research, and specifically historical research, is the survivor bias, right, and the selecting on the dependent variable. We know what survived because it’s in the history books and we read about it. We don’t know as much about what didn’t survive. So when you’re looking at why was Rosa Parks popular, you don’t know about all of the other Rosa Parks that weren’t, right? So it becomes very
tricky, and Doug McAdam has some good work on selecting on the dependent variable. So one of the things that we can do with these new methods and new sources of data, although it’s not perfect and not super easy, but we can try to get around the survivor bias and say, yes, it’s great to look at the ideas that survived; we also want to find those ideas, people, etc. that did not survive, that did not persist. So that’s the idea, that’s the motivation here, is that Wikipedia is super extensive. There’s six million articles in English on Wikipedia. I mean, it’s probably the most extensive historical, not historical, just resource for anything that you would want to look up, and it’s really, really highly trafficked. Like, it’s the second most trafficked site in the United States, second only to YouTube, which actually surprised me, that YouTube was more trafficked than Wikipedia, but over a trillion people click on Wikipedia every month. So it’s really important, and it’s also extensive. So this is like the best-case scenario: if something is going to be recorded, it’s probably going to be recorded in Wikipedia. So if, you know, Recy Taylor has a page there because it’s so extensive and so many people are working on it, she probably doesn’t get a page in a historical textbook on the movement, for example. So yeah, best-case scenario, Wikipedia: if it’s out there, if it’s being recorded, it will be recorded in Wikipedia. But as I will show, some stuff don’t make it into Wikipedia, and I think that’s important, and that’s what I’m interested in. Yeah, um, I’m only looking at English-language stuff here, but there’s a lot of opportunity to do cross-language analyses as well for people who are ambitious and looking at cross-language stuff.

The motivating question

All right, so we have Wikipedia. The motivating question then for me is: what ideas, people, and organizations from the early women’s movement are present in contemporary Wikipedia? So fairly simple question: what is there, what is not there, and also what metadata is
correlated with the presence or absence of the idea, person, or organization. So really trying to dig into what explains it, what are the patterns between what is there and what is not there, to really think about why some things persist and get kind of enshrined in our historical memory and what doesn’t. So that’s the two broad motivating questions here for me.

The data

So the data for the women’s movement side, which is tricky to get in any systematic way, as anyone who’s done historical research knows, comes from the Alexander Street collection called Women and Social Movements in the United States, 1600 to 2000. So this is a collection that consists, it’s an online collection, so it’s digitized, which is great for me because I do computational stuff. It consists of about 120,000 pages of primary source collections, and there are of course important features of this collection that impact the interpretation of the analysis. So the collections and the sources were selected by the founding editors of this collection for their relevance to this specific collection, and in particular they wanted to focus on the diversity of the movement. So they’re purposefully collecting documents, so already we’re throwing out a bunch of ideas that existed, a bunch of stuff that existed in the women’s movement, in that first selection process. So they went and they were trying to be extensive. It’s an online database, so it’s extensible, meaning they can add as much as they want; they’re not constrained by the length of the textbook, for example. So there’s a lot of resources there, but it’s highly curated and highly selected. So we already are starting off with a selection bias going into it, but we gotta start somewhere. So this is...

It sounds like the stuff is also OCR’d too, so like this also makes it, that’s great.

Yeah, yeah, it’s not even just OCR, it’s, um, corrected OCR.

Oh, nice.

So it’s a hundred percent, I mean, more or less a hundred percent accurate, which is a huge benefit. I have done
projects on OCR text and it’s a nightmare. So that’s another benefit of this collection, is it’s corrected. Because it’s online, you can look at the actual text online; it’s not just the images, it’s the actual text online, you can copy and paste from it, so it’s corrected. Yeah, so it’s a really nice resource, um, and thankfully my university subscribes to it, so I was able to purchase the actual underlying data behind it, which is fantastic.

Documents

So this is, um, a description of the documents that were actually available to me. So I purchased an XML file that contains all the documents, so it’s kind of a little bit of a hodgepodge of curated documents from a variety of women’s movement stuff here, and, you know, the number of documents is quite different. So I’m focusing on the primary source documents, so the ones that were written by people who were involved in the movement, so Writings of Black Women Suffragists, etc. That’s the kind of selection. So we go from the selection of which documents are included in this collection, to which documents I can actually get in the full-text readable format, which I think is everything that they have on the website, and then selected further to the primary source documents, the writings by the actual women. So that 120,000 documents gets boiled down to just over a thousand documents. Um, I also did a little bit of curating here to narrow the time frame just for this initial analysis, so just looking at 1898 to 1954. So sorry, Joe, I’m not going to go to four centuries of women’s movement activists, but trying to get some comparable analysis going. So Writings of Black Women Suffragists, and I narrowed it to 1898 to 1954; I do have earlier writings from that collection that I could use. The Equal Rights journal is the journal of the National Woman’s Party, which is a largely white, largely middle- to upper-class, in some ways quite racist women’s organization, and so they, you know, champion the equal rights
amendment, etc. And then the National Consumers League is a working-class women’s organization, so they focus on the rights of working-class women, trying to think through how we can ease burdens on working-class women and provide more rights for them. So they were often in direct, um, competition with the Equal Rights, the National Woman’s Party, so directly competing, like fighting over bills and such. So it’s three very different collections, different perspectives on the same, um, movement here. So that’s my collection.

Ideas

The first step then, when we have the women’s movement data, is identifying ideas, and this computationally is not trivial; it’s actually quite difficult. So just think for a moment with me about the steps it would take to identify ideas in a text. Like, think about reading through a text: how would you say this is an idea? How would you bound the idea in the text and say this is a discrete idea? I’m not talking about themes. So we know we can use things like topic models to pull out themes, we can use word embedding models to pull out the ways in which words are used and the relational distance between words. We have all of that, but I want discrete ideas. I want: this is an idea that we can look to see that was present in the women’s movement and may or may not be present in Wikipedia, in the history books as I’m measuring it. So it’s a tricky problem, and I think we’re doing better as a community at identifying this, but initially, when I started doing computational methods, this is what I wanted to do, and then very quickly I was like, wow, this is nearly impossible, I’m not going to do this. But I’m back to it, thinking through how we can identify these discrete ideas in the text.

Key phrase extraction

And the best way that I know of is through key phrase extraction, which is a very common and by now actually quite good way of extracting phrases from text. So I used the RAKE algorithm, which is one of the older key phrase extraction algorithms, so
it stands for rapid automatic keyword extraction. It uses a list of stop words, it uses phrase delimiters. I did not make this slide, I borrowed this slide, um, and I’m going to kind of hand-wave over the math here, which I don’t think is particularly important here, but the idea is it detects the most relevant words or phrases in a piece of text. So it’s not the most frequent, it is the most, quote unquote, relevant. And some phrase extraction algorithms use Wikipedia to identify phrases, so they take phrases from Wikipedia as a way to kind of train the algorithm to identify these common phrases. I of course don’t want to do that, right? I don’t want to have Wikipedia on both sides of the equation. So that’s why I didn’t use some of the more kind of sophisticated new algorithms to do this, because I do not want to condition the phrases that I extract from the original text based on whether they appeared in Wikipedia. It would just, you know, throw off exactly what I’m trying to do. So going to the go-to, trusty, early RAKE algorithm to identify these phrases. And...

Laura, can I ask a quick clarifying question on RAKE? So is it the case that, um, does it sort of have like a preset bag of words, sort of pre-identified phrases, or is it sort of unique to each corpus? So for instance, like, I know it’s not necessarily like a tf-idf or anything like that, um, like it’s coming into it, I guess it sounds like, based on Wikipedia, um, like these are sort of very common, not ideas but phrases that have already been identified in sort of this wider body of knowledge, um, vis-a-vis Wikipedia. Is that right?

Not RAKE. So other phrase extraction algorithms do that.

Okay.

Not RAKE, though.

Okay.

Not RAKE. RAKE looks at stop words and looks at other ways of bounding phrases, and you as the user, you input the number of words you want to look for. So I did everything from unigrams up to, I think, five-word phrases. So you can distinguish how many, or
you can specify what types of n-grams you want. So yeah, I did one to five, and then it uses a system of rules based on stop words, based on some other kind of components of phrases. Unlike these other phrase extraction algorithms, it does not use a list of, um, phrases from Wikipedia, so like it doesn’t use Wikipedia titles, for example.

Oh, that’s super neat.

Yeah, so in another project I use a different phrase extraction algorithm that is conditioned on Wikipedia, but I just can’t, for obvious reasons.

Can’t, right, yeah.

Yeah, I don’t want to do that here, right? Yeah, so this is really just stop words, other phrase delimiters. I think there’s some tf-idf stuff going on as well. Actually no, it’s not, because it only looks at individual text, so there is no tf-idf stuff going on.

Okay, it’s just stop words. Got it.

Um, but what it does is, yeah, it goes text by text, so it’s single text by single text, and this is two examples of text here, with the paragraph separation there. It would take that as input and then it extracts these phrases. So “United States”; “rights of women teachers” is what I would think of as an idea in the women’s movement, focusing on the rights of women teachers, or “women teachers” as a phrase; “opposed legislation”; “equal pay measure”; um, you know, “city of Boston,” so it identifies places; “New York”; uh, “boards of education”; “teachers equal pay bill.” Um, it identifies names like “Fairfax Brown,” “Miss Mary Cromwell,” uh, locations, “Auburn, Michigan,” etc. So it’s a wide variety of things, and some of them in my eyes represent discrete ideas, concepts, people that the women’s movement was proposing, and some of them are pretty generic, like “city of Boston” and “New York.” So it does have to be filtered, and we filtered it by hand. So this method extracted 23,000-ish phrases, and me and my trusty grad students went by and just filtered it: which is relevant, or not relevant, or relevant but may need to be corrected, because there was
some corrections that were needed, or not. Which sounds like a lot, but that is a human-sized problem: we could do it in about five hours of work each, which to me is not much. That does pose the problem of scalability, of course. So if I decided, oh, I want to add more text here, or I want a different corpus to look at, you have to go through that process again. So I think there is still a lot of work. I mean, I can’t think of a computational way to solve that, um, you know, separating out “city of Boston” from “teachers equal pay bill.” That’s really quite tricky to do. Perhaps you could use Wikipedia for that, but it gets a little bit tricky. So it’s done by hand, and in general, when I use computational methods, I find that filtering lists is very quick and very effective. So my preferred approach is to use computational methods to pull out relevant lists of words. So in one project I look at verbs and verb phrases to capture tactics, and then just filter the verbs and verb phrases for tactic or not. I find it’s a really great way to identify kind of lists of words, and Gary King and some other psychologists have that same philosophy: we’re really terrible at coming up with words in a category. If I were to say, name all the animals, you would name like five and then you would panic and you’d be like, I don’t know any animals. But if I gave you a list and said, pull out all the animals, you would do it very effectively and very quickly. So for people who are venturing into these types of computational projects: filtering lists is great, making lists is terrible. So computational methods to make lists, humans can filter lists.

So after filtering the phrase list, the types of phrases we’re focusing on are actors: Ida B. Wells, people; organizations like the National Woman’s Party; and kind of groups of actors, American feminists. Movement structures and events, so things that make movements go and what they do on kind of a day-to-day basis, or who is doing that stuff: so city council, women’s committees, Seneca
Falls conventions. These are not strict codes, we didn’t like attach a code to each one of these phrases, but just generally the kind of framework we were using that we found emerged from these phrases to help us classify whether they were relevant or not. Constituencies: working women, married teachers, French women, so like groups of people that the movement was focusing on. Grievances and targets: race prejudice, maternal death; canneries was a frequent target of the early women’s movement, um, thinking about workers’ rights and canneries, for example. And then ideas, so specific ideas like accident compensation, specific acts and bills like the Sheppard-Towner Act, or more kind of generic ideas like social uplift, which was a very important concept in the black women’s movement in this period, or equal rights, which was an important concept for the white women’s movement in this period, so things like that. And then specific public sphere institutions, which is a bit of a tricky category, but the AME Church was very important to the women’s movement, and then universities were very important. So, you know, who knows if they were relevant to the movement or not, but they were at least mentioned in the women’s movement documents here. So that’s the range of types of ideas that come through this algorithm, and we ended up with about 5,700 unique phrases, and these phrases occurred about 15,000 times in the corpus, so they were used on average about three times each in the corpus. So now we’ve gone from corpus to 23,000 phrases to 5,700 phrases that we thought were relevant to the women’s movement here.

Descriptive statistics

All right, now comes the descriptive statistics. So this is who provided the ideas. So the Equal Rights journal provided, in this sense, three thousand of the phrases come from the Equal Rights journal, um, etc., over the three groups. And the Equal Rights journal used the phrases quite a bit more often, so of these 3,000 phrases they used
it close to 10,000 times in their corpus. So there’s some, uh, distribution in who is providing the phrases, which is, um, interesting and potentially consequential.

Phrases

And then here are the frequent, um, phrases from each of these groups, and I’ve just put in orange ones that I think are kind of interesting. So National Consumers League, not surprisingly, is looking at labor laws, union, education, etc. The Equal Rights journal: equal rights amendment, not surprisingly; status, which probably has to do with the commissions on the status of women, which they talked about a lot; freedom, education, family. In the black writings we unsurprisingly get like colored women and negro women; war was a big topic, uh, civil war, freedom, slavery, community. Uh, body was interesting, and I was looking into how body showed up, which was I think an interesting phrase. So there’s a lot you could do just simply looking at the difference in the types of topics and phrases these women’s groups were talking about. I mean, black women, I think, are undersold in the role that they played in the first-wave women’s movement. It was really huge, and it was much different than the role that these white middle-class women were playing. So just in terms of like digging into what are the different things that these groups found important is one thing that we can do before we even get to Wikipedia with these phrases that we’re looking at. Here’s the same thing, but I’m looking only at two-grams and up, so just throwing out the unigrams here. Similar thing: you know, child labor amendment, eight-hour days, not surprisingly focused on labor here. Jury service: the jury rights movement was big for the Equal Rights, for the National Woman’s Party. We get feminist movement here, and the feminist movement was largely only mentioned by the National Woman’s Party. The word feminism was not used by black women, it was not used by the National Consumers League, which is interesting, and we know that second-wave feminists, black
second-wave feminists, a lot of them refused to use the word feminist and used the word womanist, and this goes back into history. This goes back to the 1900s, when the word feminist was first imported to the United States from France. So there’s an interesting historical origin story there. And then over here we get industrial education, training schools, so this is following the Booker T. Washington kind of philosophy of industrial education, that type of kind of economic focus. The Tuskegee Institute is important here, so we’re seeing that focus come there as well.

All right, so those were the phrases. So now we’ll shift to Wikipedia: which of these showed up in Wikipedia? So in all of the pages in Wikipedia, and this is the full text, not just the title, just around 90 percent of these phrases showed up in Wikipedia, which is great, I think, which means 10 percent of them did not, but 90 did. And there’s not much difference across these three groups here in the amount or proportion of phrases that showed up.

And sorry, this is across the entire body of knowledge of Wikipedia, so like any English language, so like whether it was the Parks or talking about like hydrochemical, what have you, like these are phrases that popped up?

Okay, yeah, that’s right. And some of the phrases are pretty generic, like abdominal muscles was a phrase that came out in the women’s movement writing, which does in fact show up in Wikipedia, but, uh, not necessarily in relation to the women’s movement, right? Which is why I took this second step. Yeah, abdominal muscles, there was a lot about, uh, women’s bodies, and that actually has to do with health care, so a lot about the health care of women, um, reproductive issues and such. So that exactly tees me up to look at the second graph, which is: I don’t want to just look at all pages, but were these issues mentioned in any page that was a history page? It doesn’t have to be history of the women’s movement,
just a history page, so it contains history in the title or it contains movement in the title. So it’s an attempt to narrow down Wikipedia into more of a history book format. So if you read an article, you don’t know what you want to search for, you just want to know what is the history of the suffrage movement, you go to the suffrage movement page. And just a note here that I do get the page titles that redirect. So one page on Wikipedia tends to have three or four pages that redirect to it, and so I am searching through all of those titles, not just the title that it ends up on but all of the pages that then redirect. So the history of women’s suffrage is a title, but it redirects to woman suffrage. So just a note there, but I do have all of those titles. So this is an attempt to say not only do the phrases show up in Wikipedia, but if you were reading a page on some sort of historical thing, would you see these phrases in there? And here we see it’s now reduced down to about 60 percent of the phrases show up in a history or movement page on Wikipedia, which in my eyes is still quite high, but it still leaves a full 40 percent of these phrases that are not showing up in the history pages here.

Uh, just to quickly, uh, clarify, so I just want to make sure, I thought it was a tag on the Wikipedia pages, like, you know, you label your article as, okay, this is related to history or this to social movement. Is that right?

No, this is not from tags, this is from the actual title. So does the word history or does the word movement appear in the title of the page? There are tags, so Wikipedia is tagged, um, with things like feminist or movements or women’s issues. That could be an alternative route. I was looking at that, and it didn’t make sense to me why some things were tagged as feminist and some things not. It was pretty weird. But I do want to look more into using those tags, those metadata tags, so I think that is something that’s potentially a next
step and if anybody here has done research using wikipedia data i would love to hear your opinion on the usability of those tags and how systematic they are because really just doing some spot checking i was like wow that’s that’s kind of a weird list of feminist pages so that’s why i didn’t use those so i’m just going for interpretability and simplicity was just saying does history or movement show up in the title uh are these titles also contributed by internet users or yep okay yeah but they do try to do some you know wikipedia has a bunch of style guides they’re pretty strict about what gets included and what doesn’t and they will the like actual people who work for wikipedia will change things if it doesn’t fit their style guide they’ll either do it by hand or do it by a bot so it’s pretty controlled that’s especially the titles but everything about wikipedia is pretty controlled yeah everything is contributed by the editors so people can add pages and decide on the title and yeah it’s users and then the third way of slicing the data was actually looking at the titles so this is all again all page titles not the history or movement and the question is do the phrases show up in the title of the page why is this important well when you do a google search if the phrase is in the title it’s most likely to be that first hit that you see and it’s most likely to have that little call out box on the right side so if it’s not in the page title the the page is going to be further down in your search terms and you might even have to add the term wiki in there to get the page so it is important if the phrase is in the title versus not it impacts the way people read um and find out about information and so now we’re hovering between 54 to 58 percent of the phrases are in at least one page title and again across all of these there’s not much difference and none of this is statistically significant the differences here across these three collections so this is kind 
of the meta view of what’s happening with these phrases in wikipedia here are some of the phrases that are not in wikipedia so minimum wage boards industrial hygiene bureaus child wage earners it’s not there mandatory minimum wage law uh jury service bill surprisingly to me was not in wikipedia and anti-lynching hero ada young auxiliary is a pretty niche group that was important to the black women’s movement lucy stone civic league et cetera so just these are some examples of phrases that are not in wikipedia at all at least the way we searched for it oh and by the way we searched for the phrase using elasticsearch so we did an edit distance so it didn’t have to be the exact phrase we allowed an edit distance of one for each word so if there were three words we allowed an edit distance of three and if there were two words we allowed an edit distance of two so there was some room for different spellings etc but not much so it was constrained but that’s how we did the the search of wikipedia with these phrases and of course lowercased and all of that stuff um here are some phrases that were in wikipedia but not in the history of the or the movement pages so some interesting things here jim crow car which was really really important to the early black women’s movement does not show up in any history page in in wikipedia although it does show up in wikipedia so if you were searching through enough pages you would find it but it’s not really connected to the women’s movement uh social uplift and nobler ideas were also really really key concepts to the black women’s movement that don’t show up in these history pages here same same over here pure food law was very important to the national consumers league night work etc that don’t show up in these history pages on wikipedia so what then is lost to history and this is where i need your folks’ help and where i’m getting a little stuck so how do i characterize or figure out what’s going on with these 
phrases and it’s between 10 to 40 percent depending on how you slice the data that don’t show up in wikipedia my my initial kind of look through it and i did some word counts i did some frequency counts i just did some qualitative look for black women it’s specific i mean the general theme is specificity which is not surprising um that the specificity is lost but this the type of specificity is different across these groups so for black women it was specific clubs organization and departments that were the most likely to not show up in wikipedia for the equal rights journal so this is the national women’s league the national women’s party it was specific commissions bills and treaties and then for the national consumers league it was specific commissions laws and types of employment so a lot of the specific industries that they were focusing on does not get mentioned in their pages on wikipedia so just to show uh one case study of this one of the things i can do is identify phrases that have a specific word for so departments what departments are being mentioned um this is easy to do computationally so i just was curious if we break it down to specific types of ideas do we see differences across groups so i’m looking at any phrase that mentions the word department or departments and now we start to see some more systematic differences so in all pages a hundred percent of the departments mentioned by the equal rights journal was somewhere in wikipedia i mean there’s some drop off in these other groups but here we start to see a big difference so while around 50 percent in the equal rights journal national consumers league were mentioned in one of these history pages only 35 percent um from the black women’s groups were mentioned in these history pages and a lot of that has to do with things like the negro department which is not mentioned in these these kind of race issues that we don’t associate with the women’s movement race 
is separate from the women’s movement and we’re starting to see that a lot here where we don’t see a lot of these these departments that maybe are not gender specific but still were crucial to the women’s the black women’s movement were not seen in wikipedia so here are some of the departments that are missing so again color department negro department but there’s some other ones citizen citizenship department as well so immigration was a huge part of the early women’s movement and that a little bit gets lost to the history as well this focus on immigration and nationality etc here and then this is wikipedia titles so not surprisingly these phrases don’t show up in the history page titles because they’re generic titles like the history of the women’s movement the history of the colored women so they’re not going to show up in the history pages title but we see a pretty big difference here where 70 percent of the departments mentioned by the equal rights journal get their own wikipedia page where only 40 percent of the departments mentioned by the black women writers get their own page so now we’re starting to see if you’re kind of searching through wikipedia it would be much easier to find information on these departments versus the ones that were important to the the black women’s movement here so that here we’re starting to see a little evidence that there is actually systematic differences that don’t get picked up when you’re looking at the phrases as a whole here so very preliminary takeaways and all of this may change in the coming months as i dig deeper into this i’m convinced and still convinced that wikipedia is a really fantastic resource i think it’s just an incredible source of knowledge for our world and they’re so open to making it better and they do campaigns to add more pages so i think it’s just something that we should celebrate and look at more still 10 percent of the phrases which is about 500 phrases are not present in wikipedia and this i’m going to do some more 
cleaning so this number will likely go down so it’s it’s pretty dang impressive how comprehensive wikipedia is in covering this movement and certainly if we went back to the 1600s we might find something different but at least for the the modern stuff it’s pretty good not surprisingly specificity is missing and what’s interesting to me is the type of specificity is different across subgroups so specific councils which i didn’t show and departments mentioned by black women are less likely to be in wikipedia this has to do with what these women were talking about so this is also a difference in what the women were proposing across these different communities not just in what wikipedia is covering so it’s both what wikipedia is covering but also what the women were finding important that is different across these and i think that suggests that there are some interesting patterns in this data which may there may be other systematic differences in what’s missing across subgroups of ideas and again this is where i’m a little bit this is where i am now is all right i have these lists of phrases that don’t show up in wikipedia across these three clusters i can’t i don’t have recourse to my normal tools topic models you can’t do on phrases word embeddings doesn’t really make sense so i’m a little bit you know i’m i don’t feel like i can go back to the stuff that i know really well and i’m not quite sure what to do with the phrases now so my big questions are about next steps um and what i was just saying so in some slices of the data a full 70 percent of the phrases don’t appear in wikipedia so if you looked at the departments that showed up in titles for example like how is this just i have to go in by hand is there some way i can think about patterns across these ideas that would help me get kind of a handle on what’s going on with representation in wikipedia from this this is kind of an aside but i’m also bringing in the the role of newspapers so i have the 
chronicling america data set thinking about that as an intermediate step did the phrase show up in the contemporary chronicling america newspapers and is that kind of an intermediate step to shepherding it into wikipedia so i’m currently running that analysis now to see if i can identify that um chronicling america the ocr is awful awful awful so that’s a little bit questionable if how much we can do with that and just conceptually what to do about gaps in the archives difference in writing styles across group all of the historical issues that you know are important for any sort of historical analysis comes into it here so just really kind of thinking about how we do this type of project when the archives are spotty when there’s just issues of style differences and and such across these these organizations for example i have the journal for the national woman’s party that i have just the writings for black women so of course the content is going to be different and is that kind of going to impact how how we think about what’s covered the fact that they’re different is also indicative of the way we think about history by the way and then just some final thoughts and then i want to i want to get your your thoughts on this so i’m what i’m thinking of is a targeted comparison comparing the national consumers league collection to the actual national consumers league page equal rights journals the national women’s party page i don’t know how to do that with the black women’s writings because it’s there’s not a page that’s like black women’s writings right so i don’t know what page i would compare there or just look at all three collections as a corpus to the history pages as a corpus and not looking at the phrases or maybe looking at the phrases but just directly doing some comparisons there more than just mentions i mean one potential idea i have is looking at the idea structures measured via network analysis so if you were to 
read the black women’s writing collection and you were to read the wikipedia history pages what is the structure of ideas you would see and how how is that different so that’s one concrete idea i have and it really does borrow from peter bearman’s work on word networks also monica lee and john levi martin’s work on semantic networks and and that sort of thing so that’s one specific idea that i have and that’s it what else what else can i do what how do i get a handle can i ask a question laura yeah so when you think of the the process not you know not in quantitative terms but just sort of in natural language terms what do you what’s your working model in your head of what gets remembered and what doesn’t um i so i think it would certainly i at least i hypothesized coming in that certain groups what certain groups were doing would be more remembered and recorded because they are preserved in the archives more they are they were publicized at the time they were probably in the uni in the newspapers at the time so that was my big initial hypothesis was there’s going to be some demographic effect here where white women especially middle class white women are going to have more control over what gets remembered in history versus other groups so that’s one big model um yeah actually that is the biggest even i had even like even like rosa parks right like uh my understanding is that her case was publicized as sort of a through a media event that was like deliberately orchestrated like she had organizational support and i wonder if maybe part of part of the puzzle is figuring out who had you know organizational resource support to have their messages magnified you know in the uh in the media sorry i’m just admitting somebody here something like that you know on a related story i remember when i was a graduate student having a couple wikipedia editing wars over uh henry ford’s history of anti-semitism and there they had a very very fast they had 
like a rapid response team that would contest any mention of henry ford’s anti-semitism almost immediately to the point that i figured it had to be some type of organized effort to you know maintain it so you know organizations might not just platform ideas and help immortalize them but they might actually actively squash them and if i’m wondering if there’s some way to do that by tracing the genealogy of an idea that died or didn’t something like that or the edit pages right we have the entire history of the edit pages of wikipedia so you certainly could pull in that data as well to try and figure out is there organized efforts on either side like the number of edits so i certainly have right away the the last edit that was done on the page so some were edited in 2015 some are currently being edited today so there’s a lot of other data from wikipedia um which i think could be interesting um so yeah can i think just one more i i would wonder if the concepts or people that you see featured today if you look through the news archives you would find that they commanded a news cycle like there was a moment where they captured the national attention and were able to entrench themselves in the collective imagination and now you could pick up quantitatively right like just the hypothesis would be that you’d have to dominate a news cycle to get the wide attention that would be required for cultural creators to be cognizant of your existence know your story and then have the hazard of creating it in more content that’s consumed i wonder exactly which is why that chronicling america data is key there yeah precisely so is if we’re looking at the 10 percent that didn’t show up in wikipedia or the whatever that was 20 30 percent that didn’t show up in these history pages is that conditioned on their not only the presence in the newspapers but the duration of the presence or the geographical reach of the presence where we have the city of the 
newspapers so that is precisely i think why i would want to bring in the chronicling america data there okay may i ask a question don’t you guys hear me now did you see any trending between uh the older the newer stuff the newer references for your list of terms being more relevant like maybe old phrases kind of started to evolve into something new yeah the so um there is some confounding there specifically with the way we refer to the black population we don’t use the term colored anymore we don’t use the term negro anymore and that’s not the same way with white people we have always used the phrase white and we will always for the foreseeable future use the phrase white i mean women weren’t saying caucasian they weren’t saying the words caucasian there we were using negro and colored so that is certainly an issue there specifically with like colored women workers doesn’t show up in wikipedia but black women workers might and so that that changes so that’s an important caveat um and it’s an interesting problem for wikipedia in general is when you talk about these historical events do you use the phrase that they use themselves like colored women workers or do you update that to the modern language which is black women workers or african-american women workers which probably nobody would want to use now so that’s that’s a big question um with the other stuff with the other kind of phrases and departments you would think that at least with the departments in particular you would use the actual department name if there was a negro department you would want to talk about the negro department in a history book so there’s you know different different ways of thinking about the way phrases change over time tricky to capture computationally i would say these phrases changes over time and i’m not sure what i would do with that maybe look yeah newspapers i don’t know um my time frame is actually not all that large it’s 1898 to 1954 
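One way to picture the terminology problem raised in this exchange is the sketch below. It is not the method used in the project. It assumes a small hand-built substitution table (the hypothetical `TERM_VARIANTS`) and a hypothetical helper `expand_phrase`, and it simply lists every version of a historical phrase that a modern page might use, so that each version could be searched for separately.

```python
from itertools import product

# Hypothetical substitution table (an assumption for illustration only):
# each historical term maps to the terms a modern page might use instead.
# The original term is kept first so the original phrase is always searched.
TERM_VARIANTS = {
    "colored": ["colored", "black", "african-american"],
    "negro": ["negro", "black", "african-american"],
}


def expand_phrase(phrase, variants=TERM_VARIANTS):
    """Return every version of `phrase` produced by swapping in variant terms."""
    words = phrase.lower().split()
    # For each word, list its possible replacements (or just the word itself).
    options = [variants.get(word, [word]) for word in words]
    # The Cartesian product enumerates every combination of choices.
    return [" ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for candidate in expand_phrase("colored women workers"):
        print(candidate)
```

A phrase with no listed terms comes back unchanged, so the same function can be run over a whole phrase list. Whether a match on an updated term should count as the original phrase being remembered is the substantive question raised in the discussion, and this sketch does not settle it.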
um so there may be some kind of bias toward the 50s but my guess is actually there’s more bias to the first wave movement because that’s where most of history is done so 1890 to 1920 would be my guess but i actually could look more into year of creation can i jump in yeah so um this is a really very exciting and amazing project i really really like it it’s really impressive um but like just to contribute one um perspective maybe so the way i see black women writing is like black women is in the intersection of racial discrimination and gender discrimination right so like right now here the reference group seems to be like equal rights journal or some national consumers league i the the the comparative perspective i think you can further deconstruct further break it down to say what’s the difference between black women’s writing versus black guys writing would they draw even more attention like say for like for for for males right black males their uh their movements or their writing would actually draw even more attention than black women and then also black women versus white women like i see like there are multiple dimensions going on there and um and also another because i just now i i think at one point you mentioned a lot of phrases they do appear like they both appear in black women’s writing and also in wikipedia but it’s not in wikipedia’s um history or movement page or whatever so like where do they appear then what are they associated with so they do appear in wikipedia but not associated with black women’s writing and then they appear somewhere else like seems like they were isolated from uh from their original contexts then where did they go and why did they go there so i think all those directions are worth dipping um like diving deeper into and um what else uh oh and another like one last thing but i think you have already mentioned this because you are now um also shifting your attention to compare this with uh contemporary writing or 
contemporary memories i would say because like right now we thought something is lost to history right but probably it’s also because we don’t even they just go uh they they are going actually unnoticed even at present like right now are we also remembering black women’s writing at this time like at present i like like my my so i think the point is are they really lost to history which which suggests that they were once remembered they noticed and then they went lost or they actually they didn’t get any attention at all and they just you know disappear so um that’s like several dimensions i was thinking through yeah and that that relates back to do you guys prefer that i stopped sharing my screen so we can see each other or should i keep myself that’s great yeah uh stop sharing is great yeah sure yeah yeah i always find it so weird talking to these little tiny boxes um yeah so that relates back to joseph joe or joseph joe joe’s comment that i’m looking at the initial process so rosa parks it wasn’t arbitrary that her event became the event it was actually you know they came to her they specifically the the movement came to her and said rosa parks we need you to get arrested because we’re going to turn your case into a thing and she had a really great history she was organized you know she didn’t have any problematic features she was a upstanding citizen etc so it’s she didn’t just get tired and give up her seat the movement said you need to do this rosa parks and we are going to be behind you so there’s there’s a process beyond just simply what do we remember in the process and so chronicling america i think would again get at that was it remembered at the time so of the 10 percent of the phrases that aren’t in wikipedia were they not even mentioned by newspapers like was it truly just a non-starter and nobody was talking about it even at the time so i think from this conversation it’s i that seems to be the essential next step is bringing 
in this chronicling america data so i think first first response i’m getting is that should be my focus is looking at that data because that will be essential um yeah there’s so many different comparisons here you know the one of the issues with this is this came out of the work on intersectionality is that the when we think about the women’s movement we think only about gender and so the race work that black women were doing which was they were being discriminated against because they were black not just because they were women it was so important and in many cases was more important because we’re talking post reconstruction here and just the absolutely horrific stuff that was happening in the south with the rise of the ku klux klan lynching was a huge part of this black woman’s writings and who are we to say that that’s not part of the women’s movement because it was i mean it was led by black women ida b wells was one of the foremost people who were doing this stuff so that that process is really interesting to me that you know we we construct the women’s movement as being so narrow compared to what the participants were thinking about and that’s one of the things that i really am interested in here and why that difference in the history pages was so important to me the fact that there was a huge drop in which of the departments mentioned by black women were in those history pages was a very startling finding to me and it was not surprising unfortunately but it was startling and i think that really starts to get a handle on how we say this is the women’s movement this is the history and in fact that was all history right it could have included men’s history and and the history of race movements and still we see this big drop off so that’s now where i think there’s something there there’s something happening and we can really kind of get a handle on how we’re defining history and such so where does one of the questions was like where does that go so the phrases 
that were in wikipedia but not in those history pages some of them you know some of these groups have their own pages and history is not in the title so like the lucy stone league might have its own page but when you read the history of the women’s movement page you never see the lucy stone league so how would you go from the history to that page so now it becomes a matter of like how how much knowledge do you have of the movement and would you know to search for that page probably not if you’re trying to learn about it so then then it becomes kind of like a network of knowledge thing i have a question if you’re looking for stuff that is mentioned by history then it strikes me that that is something that is remembered within the history genre or within the institution of academic history it can still be part of history like our cultural history though not recognized as part of history through you know instant formal history or institutionalized history and i think that might be part of it i got to think anything that exists today has a traceable echo going through and somebody made it survive somehow a journalist it was in a big book there’s a museum dedicated to it you know yeah it looks like it’s a history tracing type of thing and i i do think that that 10 percent once i do some more targeted cleaning i mean there is a question and i’m interested in particularly from computational folk how much cleaning you do like how much do you stick to the original phrases versus so for example black women in particular always used the miss and mrs miss ida b wells mrs ida b wells barnett that doesn’t show up in wikipedia so i do want to clean out that miss and mrs but in doing that i’m making a choice to say i’m not accepting the way you phrased it and that’s like a clear example but there’s other ones as well so that is a question to me but anyways that was a sidetrack once i clean that i actually think that that number will go down to five maybe even three percent and i think 
you’re right like virtually everything is recorded which is impressive and um exciting but that doesn’t mean it’s easily accessible you may have to dig deep deep deep into wikipedia and be really knowledgeable in in order to find these kind of what we would see as esoteric pages and why not link to them all in the history page right you can do hyperlinks within wikipedia and so what’s what’s being chosen in the institutionalized history kind of version versus the we can dump all of the knowledge we’ve had ever on the internet which is fantastic but it’s not the same as putting it in the history page which is why i wanted to zero in on that you know i i have this some idea that i would want to look at actual history textbooks that are taught to kids right and so looking at the history pages is as close to an approximation of that that i can get and lots of people do look at history textbooks um so there’s there’s that as well and i and i i do think theoretically and substantively and empirically it is important whether it’s its own page deep in wikipedia versus in kind of these two general places where it would be the first place you go if you were trying to look at that history so that’s definitely the sociology of knowledge institutionalization organizations everything that’s kind of my jam looking at institutions um i love this by the way i really do um one of the things i was thinking of was um maybe even who feels the the ability or the cultural capital to add to wikipedia themselves yeah you know that yes it’s not a formalized as a textbook right but that it could have an impact and especially if you’re seeing a difference between black women’s rights movement and the centered white women’s rights movement um it could be who feels their voice to speak even now never mind back then yeah 100 percent and we know that women are vastly underrepresented as editors on wikipedia and i’m imagining black women are even less represented and so that perpetuates what i was just 
talking about which is we decide specifically white people decide what the women’s movement is which gets then into the history pages on wikipedia and either because of resources that people just black women don’t have the time to edit wikipedia or they don’t feel empowered or they don’t feel invited they don’t do it and so that voice does not get inserted into our official knowledge and you know i can do that by gender you can look at the editors by gender getting the race of editors is incredibly tricky but 100 percent i think that that is the story and wikipedia is really great they want to include being more inclusive in their editors and so i think like this research would be very welcomed by them they would say okay here are some gaps now we know kind of where we can maybe target and improve our wikipedia pages i don’t think they would be upset to hear things like this they would be very open to it and i know that one of the things that professors sometimes do is they encourage their students to edit a wikipedia page to get them as editors to give them an account to get them in the process so one thing that you know i could do or somebody could do is say let’s go edit the history pages and make sure that this history is being accurately linked to right that could be one potential outcome of this is identifying where we can improve some of these pages and literally shape the way that the world reads history i mean it can be incredibly impactful one trillion people are clicking on wikipedia every month if one trillion not one trillion if a fraction of those look at the history of the women’s movement page and they’re not seeing those black departments and we put those in that actually could be pretty impactful i kind of want to piggyback on christine’s comment um i had a good friend in grad school who did a dissertation on wikipedia uh sort of flame wars right so and particularly in contentious articles like you could think of like uh us abortion rights for instance 
where in the early days of of wikipedia without sort of this the infrastructure that exists now at least to my limited understanding were highly contentious like minute by minute edits um uh by people sort of on you know either side of the debate right and i think like when i’m hearing um like one of the things i think might be helpful to push and advance the paper and i know i know this may be kind of annoying too because like this actually might push in a slightly different direction and everyone hates those kind of comments um but like i’m thinking like how wikipedia is sort of like the last triangular part in in in this piece and i think that like even though it is a really cool data source um but i think it’s also part of the story itself and i think like um because you’re talking about like me like you know leveraging the metadata and i’m kind of wondering you know like how the editors themselves are a big part of of of this story right as sort of like this gatekeeping mechanism right um and i think like yeah yeah you may not be able to get race but there are actually a lot of um more improved algorithms that can actually sort of infer ethnicity now that i know like um there’s like the hofstra piece in pnas where they use that to kind of infer diversity um and i think like you may be able to kind of like glean at least part of it um because i think that actually might be sort of a big part that at least for reviewers i would imagine would sort of harp on saying well you know like the you know like who get who gets initially put into wikipedia um you know if you would do some kind of temporal analysis of like when these different articles start to show up is is probably a product of you know whom are the who whom are the editors um and kind of conversely to that i’m also kind of wondering you know joe and ian shan kind of hinted at this um i’m kind of wondering if you could actually look at um you know one cheap easy readily available metadata repository would 
be academic articles so you could imagine specific journals of history related to black history maybe articles journals that are created by hbcus um like they would probably be the best um sources obviously i would imagine they would probably be more available past the 1950s but like i think that would actually be a really great source of sort of you know sort of peripheral uh marginalized knowledge that’s well maintained well curated and data you could probably readily um curate to at least kind of corroborate some of these trends or you know potentially even as an additional source although that’s again bad me pushing a different direction but i think like that i that that some like data where there is a community of scholars who are sort of curating knowledge that is obviously not in in sort of into the mainstream that you would that would be reflected therefore in wikipedia sort of like in this like fleck kind of you know esoteric exoteric kind of circles right of of of of knowledge yeah and jstor now has a research bench where you have access to the entire jstor full text of the entire jstor which is really oh cool oh it’s a great resource um so so this it would be chronicling america to kind of pick up the contemporary attention economy it would be jstor to pick up the academic attention economy and then looking at how that triangulates or compares to wikipedia which is the whatever we want to call that popular attention economy yeah i think of like you know like exoteric and esoteric circles where it’s sort of you know academics who are kind of at the core of like producing and maintaining and perpetuating knowledge and then more sort of others which are more you know common folk more activists more you know sort of using the knowledge in different ways i think that might be a kind of a like maybe a helpful frame for it can i piggyback on on on that idea uh well i i have two comments one is uh if it gets narrowed down to names like and the question is 
who gets remembered i wonder if that would be way way easier to do like a way smaller job of cleaning yeah yeah and then the uh the other thing is is i wonder if you go into the wikipedia edit histories of a lot of these pages i remember from my own experience as a wikipedian a lot of esoteric topics uh are developed mainly by a limited number of super contributors and i would wonder if there are super contributors who have read specific books and basically integrated into the wikipedia content and you can find out like the people who these super contributors are contributing have they been mentioned in a book a history book or something like that uh i wonder if that’s sort of the mechanism a mechanism that might be at work here well that’s i mean so if we think about imdb can you guys imagine the most well researched movie on imdb no no star wars oh of course and it’s it’s these really nerdy groups of white boys who love star wars and will detail in great imdb is similar to wikipedia it’s user uh yes okay so not just white but it’s usually men more women these days actually as it’s becoming disney fied more women are getting into it but so um imdb is similar where it’s user content users contribute um and put in these like details like really obscure details about movies and star wars is really well detailed i mean the most esoteric fact about star wars you could find on imdb but you probably couldn’t for i don’t know a lot of these not that type of movie things um and i’m guessing a similar thing happens on wikipedia where you just get these nerds who are super interested and they’re like i’m gonna make sure wikipedia has every freaking detail that i can imagine and it’s well cited and all of that and as uh somebody sorry i can’t see the that sophia was saying um or somebody was saying it’s not black women who are doing this they’re not often the ones taking the time to be really super nerdy about it and making sure it all gets on wikipedia um so that i think is 
getting at a mechanism and i can at least say at least from my friend’s dissertation that she found overwhelmingly the super editors that sort of joe referenced are overwhelmingly um white men uh in the us at least in the in the english language like so it’s it’s it’s not um a very diverse uh heterogeneous uh group one other kind of thought that came to mind too is like when you were talking about star wars i know like star wars and like other kind of like fandoms like star trek which i’m a big fan of um like the walking dead other things have like these wikipedia ecosystems where they are yeah yeah yeah they have their own which are sort of an independent system a rather eco knowledge ecosystem than wikipedia and i don’t know like maybe is there something similar to like to women’s movements where if there’s like a unique history wikipedia of like i don’t know like of like traditional black history women’s history that may be curated because there seems to be a wikipedia for everything these days i’m kind of wondering how that would corroborate with like standard wiki like sort of the mainstream wikipedia yeah i have no idea actually i’m i’m not in that wiki the wikis not the wikipedia but the wikis i’m really curious if there’s a women’s movement wiki and that so that is something i’m going to look up right away so i know there’s i’m just a few minutes left but i’m still and this gives me all sorts of ideas going forward i’m still a little stuck on yes i can look at who is mentioned which i think is much more tractable and easier to classify and we can very easily say you know this is a black woman white woman so absolutely but i i really do want to think about ideas as well like equal rights like social uplift so are there any thoughts on how i deal with not people and organizations and departments which i have a clear idea in my head how i would do that but clustering ideas grouping ideas looking for themes across ideas keeping in mind that i want to bring in 
chronicling america and maybe now even jstor i mean can i i still think actually networks are my best approach for this in some ways because i don’t want to lose that i really do want to think about not just people but ideas that are are proposed i would do i would defer to my big data colleagues which i’m not one of but maybe one way to do this is as an exploratory technique where you just identify things that lived and died and have to trace them out qualitatively on some level i know that you’re looking for a you know computational fixture i don’t mind qualitative analysis that’s a big part of my research pipeline so it would really just be looking at these case studies and saying here’s a cluster of ideas from these different collections that didn’t make it and just kind of really doing a deep dive i don’t mind that actually doing just a few case studies i think that could be really potentially very interesting yeah i mean to me that sort of like seems like the lowest hanging fruit would be to like do at least a case study because both it’s going to be both an analytic and methodological effort to once you identify what that is and then learning how to scale it to identify to then identify these other like these uh these other um ideas in your corpus so i think like you know maybe starting out with something that’s like well concrete and well-defined and kind of like working reverse engineering it might be the best approach i like that and then i can bring in all of these ideas about like who’s editing it who is not editing it and and all that other metadata because throwing all of that stuff in with 5700 phrases it’s going to be a lot of noise ouch go ahead no don’t go ahead you go um i have a silly idea which may not be helpful but it’s similar to charlie’s but it’s in the opposite direction so i happen to read a couple of children’s books for women’s suffrage movement because this you happen to be yeah right so i noticed that you know like in children’s 
some children books have a you know glossary on the back and it kind of you know summarize something like the key events or key ideas like uh right because it’s targeted like a children not like a high school history book all like active academic work right so it has to kind of boil down to uh the most important significant ideas in this in the children young women’s suffrage movement so i think that maybe to me that’s very helpful you know as someone who is learning american history by reading children’s book and another thing is that what i realized that i think it was after i read the after i read like four or five books on the same topic and that i came across to a book that i mentioned that has a chapter talking about this uh african-american woman figure in the women’s suffrage movement so i didn’t think about it back in that but i wonder if the the the first like four or five books that didn’t mention this african-american woman was published in the early period maybe you know 90s or in the early 2000s and as we are constantly rediscovering history right so let’s go back to joe’s idea that the materials are always there right people made a mark on history materials idea but uh as contemporary people scientists we are just rediscovering materials so my impression is that since the movie hidden figures came out i’ve seen many more books about oh there’s you know female computer scientists right who contribute a lot in the early days of the computer era and so we are just constantly discovering the the history so this idea like how we are the process of rediscovering history itself it’s kind of like uh i don’t know it’s kind of interesting like uh why four books didn’t mention that very african-american woman at all and then maybe i think the the book i read talking about this african-american woman maybe part maybe published you know like uh very recently so i mean it’s and it’s socially determined right it’s black lives matter movement goes up and the 
history mentions of the i mean yeah it’s absolutely not arbitrary what we rediscover and remember i mean the recy taylor stuff came out on the the tails of the metoo movement where it was like actually black women in the civil rights movement got their start in sexual harassment movements and that was not arbitrary that that came out in that that period when everybody was talking about it i love the idea of glossaries actually and indexes so now i’m going to think about that as a potential because that’s a cool ready-made like cluster of ideas um book indexes and stuff that’s great it’s a great idea to jump over what charlie was saying before about a case study maybe if you look at one that didn’t make it at all but like in history we remember the winners right like the rosa parks there was a successfulness out of it right and we don’t remember really the one that wasn’t successful so almost deconstructing in a way some of the ideas that you have or the key words that you have those movements or what they’re discussing that didn’t make it you know that weren’t successful we always remember the winners and to see the difference with that i think is super interesting especially with the idea that’s really out there right now about decolonizing education about this what we remember is being told by the successful movements or the successful people rather than the unsuccessful people so understanding how much falls off based on that i’m just now trying to think of a way to measure success because there’s also the the science of science literature and sociology of innovation literature that talks about like the death of ideas obviously in this case it’s science right um i’m blanking on some references i could i’ll send them to you but like um like this is actually of increased interest to see like you know like which which which concepts which pop things actually fizzled um so granted like in the case of science but i mean there is like you know social movements um there 
is like an intersection within the sort of the sociology of science so i think that actually might be at least like some theoretical use to you to kind of kind of uh kind of piggyback with what christine was saying to kind of think about a little bit too yeah definitely death of ideas i think we’re at our uh witching hour i don’t want to keep anyone over um but we’ll definitely send you any emails if we have a late night inspiration of of something to look at but um laura thank you so much for for joining us this is super fascinating please come back again once uh you know as you develop this we’d love to see host you again hear this idea or hearing new ideas um it’s been this has been really fantastic well it’s been way more helpful for me than it probably was for you so i really don’t know i really appreciate the opportunity to just bounce these ideas off an expert crowd and figure out what to do next because i was a little bit stuck but now i’m not stuck i have lots of ideas that’s what we like to hear and i love the rake thing the rake thing was like i i actually was not familiar with it so i was like super i love rake i i mean it’s a lot of false positives so just be cautious but the true positives are freaking gold sometimes so yeah that’s great i i was really like i was like there’s no way this is going to work no way this is going to work and then i was like wow this actually worked this is great which is rare in like a lot of like nlp stuff because like most of the stuff out there like it’s usually kind of crap i’m sure you and john you could test you too as well like it’s a lot of it’s not that great but you want it when it hits well it hits them all you’re like wonderful all right thank you so much thanks so much take care everyone