Business Intelligence Info

Tag Archives: Find

Did you find everything you need today?

April 10, 2021   Humor

Posted by Krisgo

About Krisgo

I’m a mom who has worn many different hats in this life: from scout leader, camp craft teacher, parents group president, colorguard coach, member of the community band, and stay-at-home mom to full-time worker, I’ve done it all (almost!). I still love learning new things, especially creating and cooking. Most of all I love to laugh! Thanks for visiting – come back soon!


Posted on April 9, 2021, in Animals, Awkward, Scary, Shopping.


Deep Fried Bits


ImageNet creators find blurring faces for privacy has a ‘minimal impact on accuracy’

March 16, 2021   Big Data



The makers of ImageNet, one of the most influential datasets in machine learning, have released a version of the dataset that blurs people’s faces in order to support privacy experimentation. Authors of a paper on the work say their research is the first known effort to analyze the impact blurring faces has on the accuracy of large-scale computer vision models. For this version, faces were detected automatically before they were blurred. Altogether, the altered dataset removes the faces of 562,000 people in more than a quarter-million images. Creators of a truncated version of the dataset of about 1.4 million images that was used for competitions told VentureBeat the plan is to eliminate the version without blurred faces and replace it with a version with blurred faces.
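
The paper describes its own detection-and-blurring pipeline, but the general approach is straightforward to sketch. The example below is a minimal, hypothetical illustration in Python using OpenCV’s bundled Haar-cascade face detector and a Gaussian blur; it is not the detector or obfuscation method the ImageNet team actually used.

```python
# Minimal sketch of automatic face blurring (illustrative only; not the
# pipeline used by the ImageNet team).
import cv2

def blur_faces(input_path, output_path):
    """Detect faces in an image, blur them, and return the number found."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(input_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        roi = image[y:y + h, x:x + w]
        # Scale the (odd) kernel size with the face so both small and large faces blur.
        k = max(15, (w // 3) | 1)
        image[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (k, k), 0)

    cv2.imwrite(output_path, image)
    return len(faces)

# Example with hypothetical file names:
# n_faces = blur_faces("sample.jpg", "sample_blurred.jpg")
```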

“Experiments show that one can use the face-blurred version for benchmarking object recognition and for transfer learning with only marginal loss of accuracy,” the team wrote in an update published on the ImageNet website late last week, together with a research paper on the work. “An emerging problem now is how to make sure computer vision is fair and preserves people’s privacy. We are continually evolving ImageNet to address these emerging needs.”

Computer vision systems can be used for everything from recognizing car accidents on freeways to fueling mass surveillance, and as ongoing controversies over facial recognition have shown, images of the human face are deeply personal.

Following experiments with object detection and scene detection benchmark tests using the modified dataset, the team reported in the paper that blurring faces can reduce accuracy by 13% to 60%, depending on the category — but that this reduction has a “minimal impact on accuracy” overall. Some categories that involve blurring objects close to people’s faces, like a harmonica or a mask, resulted in higher rates of classification errors.

“Through extensive experiments, we demonstrate that training on face-blurred does not significantly compromise accuracy on both image classification and downstream tasks, while providing some privacy protection. Therefore, we advocate for face obfuscation to be included in ImageNet and to become a standard step in future dataset creation efforts,” the paper’s coauthors write.

An assessment of the 1.4 million images included in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset found that 17% of the images contain faces, despite the fact that only three of 1,000 categories in the dataset mention people. In some categories, like “military uniform” and “volleyball,” 90% of the images included faces of people. Researchers also found reduced accuracy in categories rarely related to human faces, like “Eskimo dog” and “Siberian husky.”

“It is strange since most images in these two categories do not even contain human faces,” the paper reads.

Coauthors include researchers who released ImageNet in 2009, including Princeton University professor Jia Deng and Stanford University professor and former Google Cloud AI chief Fei-Fei Li. The original ImageNet paper has been cited tens of thousands of times since it was introduced at the Computer Vision and Pattern Recognition (CVPR) conference in 2009 and has since become one of the most influential research papers and datasets for the advancement of machine learning.

The ImageNet Large Scale Visual Recognition Challenge that took place from 2010 to 2017 is known for helping usher in the era of deep learning and leading to the spinoff of startups like Clarifai and MetaMind. Founded by Richard Socher, who helped Deng and Li assemble ImageNet, MetaMind was acquired by Salesforce in 2016. After helping establish the Einstein AI brand, Socher left his role as chief scientist at Salesforce last summer to launch a search engine startup.

The face-blurring version marks the second major ethical or privacy-related change to the dataset released 12 years ago. In a paper accepted for publication at the Conference on Fairness, Accountability, and Transparency (FAccT) in 2020, creators of the ImageNet dataset removed a majority of categories associated with people because the categories were found to be offensive.

That paper attributes racist, sexist, and politically charged predictions associated with ImageNet to issues like a lack of diversity in the demographics represented in the dataset and the use of the WordNet hierarchy for the words used to select and label images. A 2019 analysis found that roughly 40% of people in ImageNet photos are women and about 1% are people over 60. It also found an overrepresentation of men between the ages of 18 and 40 and an underrepresentation of people with dark skin.

A few months after that paper was published, MIT removed another computer vision dataset, 80 Million Tiny Images, which is over a decade old and also used WordNet, after racist and sexist labels and images were found in an audit by Vinay Prabhu and Abeba Birhane. Beyond an NSFW analysis of 80 Million Tiny Images, their paper examines common shortcomings of large computer vision datasets and considers solutions for the computer vision community going forward.

Analysis of ImageNet in the paper found instances of co-occurrence of people and objects in ImageNet categories involving musical instruments, since those images often include people even if the label itself does not mention people. It also suggests the makers and managers of large computer vision datasets take steps toward reform, including the use of techniques to blur the faces of people found in datasets.

On Monday, Birhane and Prabhu urged the coauthors to cite ImageNet critics whose ideas are reflected in the face-obfuscation paper, such as the popular ImageNet Roulette. In a blog post, the duo detail multiple attempts to reach the ImageNet team, as well as a spring 2020 presentation by Prabhu at HAI, which included Fei-Fei Li, about the ideas underlying Birhane and Prabhu’s criticisms of large computer vision datasets.

“We’d like to clearly point out that the biggest shortcomings are the tactical abdication of responsibility for all the mess in ImageNet combined with systematic erasure of related critical work, that might well have led to these corrective measures being taken,” the blog post reads. VentureBeat asked the coauthors for comment about criticisms from Birhane and Prabhu. This story will be updated if we hear back.

In other work critical of ImageNet, a few weeks after 80 Million Tiny Images was taken down, MIT researchers analyzed the ImageNet data collection pipeline and found “systematic shortcomings that led to reductions in accuracy.” And a 2017 paper found that a majority of images included in the ImageNet dataset came from Europe and the United States, another example of poor representation of people from the Global South in AI.

ILSVRC is a subset of the larger ImageNet dataset, which contains over 14 million images across more than 20,000 categories. ILSVRC, ImageNet, and the recently modified version of ILSVRC were created with help from Amazon Mechanical Turk workers using photos scraped from Google Images.

In related news, a paper by researchers from Google, Mozilla Foundation, and the University of Washington analyzing datasets used for machine learning concludes that the machine learning research community needs to foster a culture change and recognize the privacy and property rights of individuals. In other news related to harm that can be caused by deploying AI, last fall, Stanford University and OpenAI convened experts from a number of fields to critique GPT-3. The group concluded that the creators of large language models like Google and OpenAI have only a matter of months to set standards and address the societal impact of deploying such language models.



Big Data – VentureBeat


Researchers find that large language models struggle with math

March 9, 2021   Big Data



Mathematics is the foundation of countless sciences, allowing us to model things like planetary orbits, atomic motion, signal frequencies, protein folding, and more. Moreover, it’s a valuable testbed for the ability to problem solve, because it requires problem solvers to analyze a challenge, pick out good methods, and chain them together to produce an answer.

It’s revealing, then, that as sophisticated as machine learning models are today, even state-of-the-art models struggle to answer the bulk of math problems correctly. A new study published by researchers at the University of California, Berkeley finds that large language models including OpenAI’s GPT-3 can only complete 2.9% to 6.9% of problems from a dataset of over 12,500. The coauthors believe that new algorithmic advancements will likely be needed to give models stronger problem-solving skills.

Prior research has demonstrated the usefulness of AI that has a firm grasp of mathematical concepts. For example, OpenAI recently introduced GPT-f, an automated prover and proof assistant for the Metamath formalization language. GPT-f found new short proofs that have been accepted into the main Metamath library, the first time a machine learning-based system contributed proofs that were adopted by a formal mathematics community. For its part, Facebook also claims to have experimented successfully with math-solving AI algorithms. In a blog post last January, researchers at the company said they’d taught a model to view complex mathematical equations “as a kind of language and then [treat] solutions as a translation problem.”

“While most other text-based tasks are already nearly solved by enormous language models, math is notably different. We showed that accuracy is slowly increasing and, if trends continue, the community will need to discover conceptual and algorithmic breakthroughs to attain strong performance on math,” the coauthors wrote. “Given the broad reach and applicability of mathematics, solving math datasets with machine learning would be of profound practical and intellectual significance.”

To measure the problem-solving ability of large and general-purpose language models, the researchers created a dataset called MATH, which consists of 12,500 problems taken from high school math competitions. Given a problem from MATH, language models must generate a sequence that reveals the final answer.

[Image: A comparison of a MATH dataset problem with problems from DeepMind’s Mathematics Dataset and a Metamath module. Image Credit: MATH]

Problems in MATH are labeled by difficulty from 1 to 5 and span seven subjects, including geometry, algebra, calculus, statistics, linear algebra, and number theory. They also come with step-by-step solutions so that language models can learn to answer new questions they haven’t seen before.

Training models on the fundamentals of mathematics required the researchers to create a separate dataset with hundreds of thousands of solutions to common math problems. This second dataset, the Auxiliary Mathematics Problems and Solutions (AMPS), comprises more than 100,000 problems from Khan Academy with solutions and over 5 million problems generated using Mathematica scripts based on 100 hand-designed modules. In total, AMPS contains 23GB of content.

As the researchers explain, the step-by-step solutions in the datasets allow the language models to use a “scratch space” much like a human mathematician might. Rather than having to arrive at the correct answer right away, models can first “show their work” in partial solutions that step toward the right answer.
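
To make the scratch-space idea concrete, here is a small, hypothetical sketch of how a training example might be formatted so the model sees the worked solution before the final answer, plus a helper that pulls a final answer out of generated text. The prompt template and the \boxed{} answer convention are illustrative assumptions, not the paper’s exact format.

```python
# Sketch of a "scratch space" format for math problems.
# The template and the \boxed{} answer convention are illustrative assumptions.
import re

PROMPT_TEMPLATE = (
    "Problem: {problem}\n"
    "Solution (show your work step by step):\n{solution}\n"
    "Final answer: \\boxed{{{answer}}}\n"
)

def format_training_example(problem, solution, answer):
    """Build one training string that includes the step-by-step solution."""
    return PROMPT_TEMPLATE.format(problem=problem, solution=solution, answer=answer)

def extract_final_answer(generated_text):
    """Pull the last \\boxed{...} expression out of model output, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", generated_text)
    return matches[-1] if matches else None

example = format_training_example(
    problem="What is the remainder when 7^100 is divided by 5?",
    solution="7 = 5 + 2, so 7^100 = 2^100 (mod 5). 2^4 = 16 = 1 (mod 5), "
             "and 100 = 4 * 25, so 2^100 = 1 (mod 5).",
    answer="1",
)
print(example)
print(extract_final_answer("... so the result is \\boxed{1}"))  # -> "1"
```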

Even with the solutions, the coauthors found that accuracy remained low for the large language models they benchmarked: GPT-3 and GPT-2, GPT-3’s predecessor. Having the models generate their own solutions before producing an answer actually degraded accuracy because while many of the steps were related to the question, they were illogical. Moreover, simply increasing the amount of training time and the number of parameters in the models, which sometimes improves performance, proved to be impractically costly. (In machine learning, parameters are variables whose values control the learning process.)

This being the case, the researchers showed that step-by-step solutions still provide benefits in the form of improved performance. In particular, providing models with solutions at training time increased accuracy substantially, with pretraining on AMPS boosting accuracy by around 25% — equivalent to a 15 times increase in model size.

“Despite these low accuracies, models clearly possess some mathematical knowledge: they achieve up to 15% accuracy on the easiest difficulty level, and they are able to generate step-by-step solutions that are coherent and on-topic even when incorrect,” the coauthors wrote. “Having models train on solutions increases relative accuracy by 10% compared to training on the questions and answers directly.”

The researchers have released MATH and AMPS in open source so that, along with existing mathematics datasets like DeepMind’s, they can spur further research in this direction.



Big Data – VentureBeat


Tips on using Advanced Find in Microsoft Dynamics 365

February 26, 2021   CRM News and Info

A lot of small and midsize companies have implemented Microsoft Dynamics 365. The tool is scalable and flexible; it automates more than half of the daily tasks, makes the sales pipeline visible, and shortens the customer journey. Having spent quite a few years in the Dynamics 365 consulting field, I have noticed that one of the most commonly used features in Dynamics CRM is Advanced Find. Developers, administrators, and end users rely on it for various purposes.

Tip #1

For an entity, if there is a frequently used query, we can set that query up as the default query. This saves users time because the query is configured beforehand, so they do not have to build it manually. Before configuring a default query, the Advanced Find window looks like this:

[Screenshot: Advanced Find before a default query is configured]

Now let’s say there is a frequently used query for the Account entity that we need to set as the default. The query is My Accounts with incomplete information (accounts where the email address or phone number field is left blank).

To configure this, go to Settings > Customizations > Customize the System. Open the views for the Account entity and double-click the view of type “Advanced Find View”.

[Screenshot: Account entity views, showing the Advanced Find View]

Configure the filter criteria as per your requirement. In this case, the filter criteria are:

[Screenshot: filter criteria for the view]

Once the filter criteria are configured, click OK, then Save and Close, and finally Publish All Customizations.

[Screenshot: Publish All Customizations]

Now if you open Advanced Find and look for Accounts, it will look like the following:

[Screenshot: Advanced Find with the default query applied]
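
As a side note, if you want to sanity-check the same filter outside the user interface, an equivalent query can be run against the Dynamics 365 Web API. The snippet below is only a rough sketch: the organization URL and access token are placeholders, the standard account attributes emailaddress1 and telephone1 are assumed, and the ownership condition of “My Accounts” is left out for brevity.

```python
# Rough sketch: the "incomplete information" filter as a Dynamics 365 / Dataverse
# Web API (OData) query. The org URL and token are placeholders.
import requests

ORG_URL = "https://yourorg.crm.dynamics.com"   # hypothetical environment
ACCESS_TOKEN = "<oauth-access-token>"          # obtained separately

def accounts_with_incomplete_info():
    """Return accounts where the email address or phone number is blank."""
    query = (
        "/api/data/v9.2/accounts"
        "?$select=name,emailaddress1,telephone1"
        "&$filter=emailaddress1 eq null or telephone1 eq null"
    )
    response = requests.get(
        ORG_URL + query,
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/json",
            "OData-MaxVersion": "4.0",
            "OData-Version": "4.0",
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["value"]

# for account in accounts_with_incomplete_info():
#     print(account["name"])
```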

Tip #2

If you are trying to find the results of a saved query, by default you will not be able to edit the query. That is, you will not be able to add or remove filter criteria.

[Screenshot: results of a saved query, with the query not editable]

If you want to edit the query, just click on Details. After clicking, it will look like this:

[Screenshot: the query after clicking Details]

From the image, you can see that it is editable.

If you want your saved Advanced Find View or any other saved view to show up as editable by default (so that you don’t need to click on the Details option to make it editable) then you can change it in your Personal Settings, as shown below:

[Screenshot: Personal Settings]

Set the “Default mode in Advanced Find” option to “Detailed” and click OK.

[Screenshot: the “Default mode in Advanced Find” option set to “Detailed”]

If you set it to “Simple” mode, saved views will open as non-editable.

Dynamics 365 isn’t just an application anymore. For a lot of companies, be they large corporates or SMBs, it has become a part of their business strategy. If you would like to know more about how we are helping businesses transform, do connect with us.

Talking about ‘finds’, the Quick Search functionality in Dynamics 365 usually shows all Active fields instead of the one you select, which can be a little annoying. Our blog provides a solution to this issue.

For support, please Contact Us


CRM Software Blog | Dynamics 365


Studies find bias in AI models that recommend and diagnose diseases

February 19, 2021   Big Data

Research into AI- and machine learning model-driven methods for health care suggests that they hold promise in the areas of phenotype classification, mortality and length-of-stay prediction, and intervention recommendation. But models have traditionally been treated as black boxes in the sense that the rationale behind their suggestions isn’t explained or justified. This lack of interpretability, in addition to bias in their training datasets, threatens to hinder the effectiveness of these technologies in critical care.

Two studies published this week underline the challenges yet to be overcome when applying AI to point-of-care settings. In the first, researchers at the University of Southern California, Los Angeles evaluated the fairness of models trained with Medical Information Mart for Intensive Care IV (MIMIC-IV), the largest publicly available medical records dataset. The other, which was coauthored by scientists at Queen Mary University, explores the technical barriers for training unbiased health care models. Both arrive at the conclusion that ostensibly “fair” models designed to diagnose illnesses and recommend treatments are susceptible to unintended and undesirable racial and gender prejudices.

As the University of Southern California researchers note, MIMIC-IV contains the de-identified data of 383,220 patients admitted to an intensive care unit (ICU) or the emergency department at Beth Israel Deaconess Medical Center in Boston, Massachusetts between 2008 and 2019. The coauthors focused on a subset of 43,005 ICU stays, filtering out patients younger than 15 years old who hadn’t visited the ICU more than once or who stayed less than 24 hours. Represented among the samples were married or single male and female Asian, Black, Hispanic, and white hospital patients with Medicaid, Medicare, or private insurance.

In one of several experiments to determine to what extent bias might exist in the MIMIC-IV subset, the researchers trained a model to recommend one of five categories of mechanical ventilation. Alarmingly, they found that the model’s suggestions varied across different ethnic groups. Black and Hispanic cohorts were less likely to receive ventilation treatments, on average, while also receiving a shorter treatment duration.

Insurance status also appeared to have played a role in the ventilator treatment model’s decision-making, according to the researchers. Privately insured patients tended to receive longer and more ventilation treatments compared with Medicare and Medicaid patients, presumably because patients with generous insurance could afford better treatment.
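
Checks like these come down to comparing treatment rates and durations across cohorts. The snippet below is a hedged sketch of that kind of comparison using pandas; the column names (ethnicity, insurance, ventilated, vent_hours) are hypothetical stand-ins, not the actual MIMIC-IV schema or the authors’ code.

```python
# Sketch of a group-level disparity check over an ICU-stay table.
# Column names are hypothetical; this is not the MIMIC-IV schema or the study's code.
import pandas as pd

def treatment_disparity(stays, group_col):
    """Compare ventilation rate and mean recorded ventilation hours across groups."""
    summary = stays.groupby(group_col).agg(
        n_stays=("ventilated", "size"),
        vent_rate=("ventilated", "mean"),        # share of stays that got ventilation
        mean_vent_hours=("vent_hours", "mean"),  # average recorded ventilation hours
    )
    # Gap between each group and the most-treated group: a crude disparity measure.
    summary["vent_rate_gap_vs_max"] = summary["vent_rate"].max() - summary["vent_rate"]
    return summary.sort_values("vent_rate", ascending=False)

# stays = pd.read_csv("icu_stays.csv")          # hypothetical extract
# print(treatment_disparity(stays, "ethnicity"))
# print(treatment_disparity(stays, "insurance"))
```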

The researchers caution that there exist “multiple confounders” in MIMIC-IV that might have led to the bias in ventilator predictions. However, they point to this as motivation for a closer look at models in health care and the datasets used to train them.

In the study published by Queen Mary University researchers, the focus was on the fairness of medical image classification. Using CheXpert, a benchmark dataset for chest X-ray analysis comprising 224,316 annotated radiographs, the coauthors trained a model to predict one of five pathologies from a single image. They then looked for imbalances in the predictions the model gave for male versus female patients.

Prior to training the model, the researchers implemented three types of “regularizers” intended to reduce bias. This had the opposite of the intended effect — when trained with the regularizers, the model was even less fair than when trained without regularizers. The researchers note that one regularizer, an “equal loss” regularizer, achieved better parity between males and females. This parity came at the cost of increased disparity in predictions among age groups, though.
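
For a rough sense of what an “equal loss”-style regularizer does, the sketch below penalizes the gap between the average loss on two groups during training. This is a generic PyTorch illustration, not the specific regularizers evaluated in the Queen Mary study.

```python
# Generic sketch of an "equal loss"-style fairness penalty in PyTorch.
# Illustrative only; not the exact regularizers evaluated in the paper.
import torch
import torch.nn.functional as F

def loss_with_equal_loss_penalty(logits, labels, group, lam=1.0):
    """
    logits: (N, C) model outputs; labels: (N,) targets;
    group:  (N,) binary group indicator (e.g., 0 = male, 1 = female);
    lam:    weight of the fairness penalty.
    """
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    base_loss = per_sample.mean()

    mask0, mask1 = (group == 0), (group == 1)
    # Penalize the absolute gap between the two groups' mean losses.
    if mask0.any() and mask1.any():
        gap = (per_sample[mask0].mean() - per_sample[mask1].mean()).abs()
    else:
        gap = torch.zeros((), device=logits.device)

    return base_loss + lam * gap
```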

“Models can easily overfit the training data and thus give a false sense of fairness during training which does not generalize to the test set,” the researchers wrote. “Our results outline some of the limitations of current train time interventions for fairness in deep learning.”

The two studies build on previous research showing pervasive bias in predictive health care models. Due to a reluctance to release code, datasets, and techniques, much of the data used to train algorithms for diagnosing and treating diseases might perpetuate inequalities.

Recently, a team of U.K. scientists found that almost all eye disease datasets come from patients in North America, Europe, and China, meaning eye disease-diagnosing algorithms are less certain to work well for racial groups from underrepresented countries. In another study, Stanford University researchers claimed that most of the U.S. data for studies involving medical uses of AI come from California, New York, and Massachusetts. A study of a UnitedHealth Group algorithm determined that it could underestimate by half the number of Black patients in need of greater care. Researchers from the University of Toronto, the Vector Institute, and MIT showed that widely used chest X-ray datasets encode racial, gender, and socioeconomic bias. And a growing body of work suggests that skin cancer-detecting algorithms tend to be less precise when used on Black patients, in part because AI models are trained mostly on images of light-skinned patients.

Bias isn’t an easy problem to solve, but the coauthors of one recent study recommend that health care practitioners apply “rigorous” fairness analyses prior to deployment as one solution. They also suggest that clear disclaimers about the dataset collection process and the potential resulting bias could improve assessments for clinical use.



Big Data – VentureBeat


Researchers find that labels in computer vision datasets poorly capture racial diversity

February 9, 2021   Big Data

Datasets are a primary driver of progress in computer vision, and many computer vision applications require datasets that include human faces. These datasets often have labels denoting racial identity, expressed as a category assigned to faces. But historically, little attention has been paid to the validity, construction, and stability of these categories. Race is an abstract, fuzzy notion, and highly consistent representations of a racial group across datasets could be indicative of stereotyping.

Northeastern University researchers sought to study these face labels in the context of racial categories and fair AI. In a paper, they argue that labels are unreliable as indicators of identity because some labels are more consistently defined than others, and because datasets appear to “systematically” encode stereotypes of racial categories.

Their timely research comes after Deborah Raji and coauthor Genevieve Fried published a pivotal study examining facial recognition datasets compiled over 43 years. They found that researchers, driven by the exploding data requirements of machine learning, gradually abandoned asking for people’s consent, leading them to unintentionally include photos of minors, use racist and sexist labels, and collect images of inconsistent quality and lighting.

Racial labels are used in computer vision without definition, or only with loose and nebulous definitions, the coauthors observe from the datasets they analyzed (FairFace, BFW, RFW, and LAOFIW). There are myriad systems of racial classification and terminology, some of debatable coherence, with one dataset grouping together “people with ancestral origins in Sub-Saharan Africa, India, Bangladesh, Bhutan, among others.” Other datasets use labels that could be considered offensive, like “Mongoloid.”

Moreover, a number of computer vision datasets use the label “Indian/South Asian,” which the researchers point to as an example of the pitfalls of racial categories. If the “Indian” label refers only to the country of India, it’s arbitrary in the sense that the borders of India represent the partitioning of a colonial empire on political grounds. Indeed, racial labels largely correspond with geographic regions, including populations with a range of languages, cultures, separation in space and time, and phenotypes. Labels like “South Asian” should include populations in Northeast India, who might exhibit traits more common in East Asia, but ethnic groups span racial lines and labels can fractionalize them, placing some members in one racial category and others in a different category.

“The often employed, standard set of racial categories — e.g., ‘Asian,’ ‘Black,’ ‘White,’ ‘South Asian’ — is, at a glance, incapable of representing a substantial number of humans,” the coauthors wrote. “It obviously excludes indigenous peoples of the Americas, and it is unclear where the hundreds of millions of people who live in the Near East, Middle East, or North Africa should be placed. One can consider extending the number of racial categories used, but racial categories will always be incapable of expressing multiracial individuals, or racially ambiguous individuals. National origin or ethnic origin can be utilized, but the borders of countries are often the results of historical circumstance and don’t reflect differences in appearance, and many countries are not racially homogeneous.”

Equally problematically, the researchers found that faces in the datasets they analyzed were systematically the subject of racial disagreements among annotators. All datasets seemed to include and recognize a very specific type of person as Black — a stereotype — while having more expansive (and less consistent) definitions for other racial categories. Furthermore, the consistency of racial perception varied across ethnic groups, with Filipinos in one dataset being less consistently seen as Asian than Koreans, for example.

“It is possible to explain some of the results purely probabilistically – blonde hair is relatively uncommon outside of Northern Europe, so blond hair is a strong signal of being from Northern Europe, and thus, belonging to the White category. But if the datasets are biased towards images collected from individuals in the U.S., then East Africans may not be included in the datasets, which results in high disagreement on the racial label to assign to Ethiopians relative to the low disagreement on the Black racial category in general,” the coauthors explained.
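
The consistency the researchers describe can be approximated as annotator agreement: for faces that end up with a given majority label, how often annotators agree with that label. The snippet below is a simplified sketch of such a computation; the data layout (one row per face and annotator) is hypothetical and not the authors’ actual methodology.

```python
# Simplified sketch of per-category annotator agreement.
# The data layout is hypothetical; this is not the paper's methodology.
import pandas as pd

def agreement_by_majority_label(annotations):
    """
    annotations: one row per (face_id, annotator) with an 'annotator_label' column.
    Returns, for each majority label, the mean share of annotators who agree with it.
    """
    rows = []
    for face_id, labels in annotations.groupby("face_id")["annotator_label"]:
        majority = labels.mode().iloc[0]
        rows.append({
            "majority_label": majority,
            "agreement": float((labels == majority).mean()),
        })
    per_face = pd.DataFrame(rows)
    # Low mean agreement for a label suggests that category is perceived inconsistently.
    return per_face.groupby("majority_label")["agreement"].mean().sort_values()

# annotations = pd.read_csv("face_annotations.csv")   # hypothetical file
# print(agreement_by_majority_label(annotations))
```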

These racial labeling biases could be reproduced and amplified if left unaddressed, the coauthors warn, taking on a validity with dangerous consequences when divorced from cultural context. Indeed, numerous studies — including the landmark Gender Shades work by Joy Buolamwini, Dr. Timnit Gebru, Dr. Helen Raynham, and Raji — and VentureBeat’s own analyses of public benchmark data have shown facial recognition algorithms are susceptible to various biases. One frequent confounder is technology and techniques that favor lighter skin, including everything from sepia-tinged film to low-contrast digital cameras. These prejudices can be encoded in algorithms such that their performance on darker-skinned people falls short of that on those with lighter skin.

“A dataset can have equal amounts of individuals across racial categories, but exclude ethnicities or individuals who don’t fit into stereotypes,” they wrote. “It is tempting to believe fairness can be purely mathematical and independent of the categories used to construct groups, but measuring the fairness of systems in practice, or understanding the impact of computer vision in relation to the physical world, necessarily requires references to groups which exist in the real world, however loosely.”



Big Data – VentureBeat


Researchers find that debiasing doesn’t eliminate racism from hate speech detection models

February 6, 2021   Big Data

Current AI hate speech and toxic language detection systems exhibit problematic and discriminatory behavior, research has shown. At the core of the issue are training data biases, which often arise during the dataset creation process. When trained on biased datasets, models acquire and exacerbate biases, for example flagging text by Black authors as more toxic than text by white authors.

Toxicity detection systems are employed by a range of online platforms, including Facebook, Twitter, YouTube, and various publications. While one of the premier providers of these systems, Alphabet-owned Jigsaw, claims it has taken pains to remove bias from its models following a study showing it fared poorly on Black-authored speech, it’s unclear to what extent this might be true of other AI-powered solutions.

To see whether current model debiasing approaches can mitigate biases in toxic language detection, researchers at the Allen Institute investigated techniques to address lexical and dialectal imbalances in datasets. Lexical biases associate toxicity with the presence of certain words, like profanities, while dialectal biases correlate toxicity with “markers” of language variants like African-American English (AAE).


In the course of their work, the researchers looked at one debiasing method designed to tackle “predefined biases” (e.g., lexical and dialectal). They also explored a process that filters “easy” training examples with correlations that might mislead a hate speech detection model.

According to the researchers, both approaches face challenges in mitigating biases from a model trained on a biased dataset for toxic language detection. In their experiments, while filtering reduced bias in the data, models trained on filtered datasets still picked up lexical and dialectal biases. Even “debiased” models disproportionately flagged text in certain snippets as toxic. Perhaps more discouragingly, mitigating dialectal bias didn’t appear to change a model’s propensity to label text by Black authors as more toxic than white authors.
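
Disparities like these are usually quantified as the rate at which each group’s non-toxic text still gets flagged. The snippet below is a minimal sketch of that measurement; the scoring function, threshold, and column names are placeholders rather than the Allen Institute’s evaluation code.

```python
# Minimal sketch of measuring over-flagging (false-positive) rates per group.
# The classifier interface and column names are placeholders, not the study's code.
import pandas as pd

def flag_rate_by_group(df, score_fn, threshold=0.5):
    """
    df: rows with 'text', 'group' (e.g., dialect or author identity), and a
        gold 'is_toxic' label; score_fn maps text -> toxicity score in [0, 1].
    Returns the share of non-toxic texts flagged as toxic, per group.
    """
    nontoxic = df[df["is_toxic"] == 0].copy()
    nontoxic["flagged"] = nontoxic["text"].map(score_fn) >= threshold
    return nontoxic.groupby("group")["flagged"].mean().sort_values(ascending=False)

# Example usage with a hypothetical trained model:
# rates = flag_rate_by_group(eval_df, score_fn=my_model.toxicity_score)
# print(rates)  # a large gap between groups indicates dialectal or racial bias
```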

In the interest of thoroughness, the researchers embarked on a proof-of-concept study involving relabeling examples of supposedly toxic text whose translations from AAE to “white-aligned English” were deemed nontoxic. They used OpenAI’s GPT-3 to perform the translations and create a synthetic dataset — a dataset, they say, that resulted in a model less prone to dialectal and racial biases.


“Overall, our findings indicate that debiasing a model already trained on biased toxic language data can be challenging,” wrote the researchers, who caution against deploying their proof-of-concept approach because of its limitations and ethical implications. “Translating” the language a Black person might use into the language a white person might use both robs the original language of its richness and makes potentially racist assumptions about both parties. Moreover, the researchers note that GPT-3 likely wasn’t exposed to many African American English varieties during training, making it ill-suited for this purpose.

“Our findings suggest that instead of solely relying on development of automatic debiasing for existing, imperfect datasets, future work focus primarily on the quality of the underlying data for hate speech detection, such as accounting for speaker identity and dialect,” the researchers wrote. “Indeed, such efforts could act as an important step towards making systems less discriminatory, and hence safe and usable.”



Big Data – VentureBeat


THEY CAN FIND THE GUY WHO BROKE A WINDOW BUT NOT A MURDERER?

January 20, 2021   Humor

You just know the left is protecting its own in the murder of Ashli Babbitt:

A Kentucky man who is accused of breaking a window of the Capitol building moments before Ashli Babbitt was fatally shot during the insurrection earlier this month has been arrested.

Chad Barrett Jones, 42, of Coxs Creek, Kentucky, was arrested in Louisville on Saturday and charged with assault on a federal officer, destruction of government property, obstruction of justice, unlawful entry on restricted building or grounds, and violent entry and disorderly conduct on Capitol grounds, the FBI said in a news release.

According to an FBI charging affidavit, Jones broke a window near the House Speaker’s Lobby that Babbitt tried to climb through as she was fatally shot.

The affidavit cites video from the Washington Post, alleging that Jones can be seen striking the glass panels of a door to the lobby with what appeared to be a wooden flagpole.


The crowd around the man can be heard shouting “Break it down” and “let’s f—— go!” as he struck the glass, the FBI said.

Seconds after the glass panel was broken, Babbitt, 35, was shot by a police officer as she tried to climb through it to enter the lobby.

Babbitt and four others died in the Capitol riot, which was carried out by supporters of President Donald Trump who stormed the building as Congress debated Electoral College votes from the 2020 election won by President-elect Joe Biden.

FBI Special Agent Javier Gonzalez said in the affidavit that a witness identified Jones through a tip to the FBI National Threat Operation Center.

The witness said Jones was a relative who had told him he traveled to Washington DC and had used a flag pole holding a flag supporting Trump to break the Capitol window.

Another person, who identified himself as a friend of Jones, told the FBI that Jones had called him after seeing himself on the news, and called himself an idiot, according to the affidavit.

Jones is scheduled to appear in court on January 19.


ANTZ-IN-PANTZ ……


Researchers find that even ‘fair’ hiring algorithms can be biased

December 6, 2020   Big Data


Ranking algorithms are widely used on hiring platforms like LinkedIn, TaskRabbit, and Fiverr. Because they’re prone to biases, many of these platforms have taken steps to ensure they’re fair, balanced, and predictable. But according to a study from researchers affiliated with Harvard and Technische Universität Berlin, which examined the effect of “fair” ranking algorithms on gender, even ostensibly debiased ranking algorithms treat certain job candidates inconsistently.

The researchers specifically looked at the algorithms used on TaskRabbit, a marketplace that matches users with gigs like cleaning, moving, and delivery. As they note in a paper describing their work, TaskRabbit leverages ranking algorithms to sort through available workers and generate a ranked list of candidates suitable for a given task. Since it directly impacts livelihoods, if the underlying algorithms are biased, they could adversely affect underrepresented groups. The effects could be particularly acute in cities like San Francisco, where gig workers are more likely to be people of color and immigrants.

The Harvard coauthors studied how biases — specifically gender biases — percolate in TaskRabbit and impact real-world hiring decisions. They analyzed various sources of biases to do so, including the types of ranking algorithms, job contexts, and inherent biases of employers, all of which interact with each other.

The researchers conducted a survey of 1,079 people recruited through Amazon Mechanical Turk using real-world data from TaskRabbit. Each respondent served as a “proxy employer” required to select candidates to help them with three different tasks, namely shopping, event staffing, and moving assistance. To this end, recruits were shown a list of 10 ranked candidates for each task and asked to select the top 4 in each case. Then, they were given ranked lists generated by one of the three ranking algorithms — one that ranked candidates randomly (RandomRanking), one that ranked candidates based on their TaskRabbit scores (RabbitRanking), and a “fair” ranking algorithm (FairDet-Greedy) — or versions of the algorithms that swapped the genders of candidates from male to female and vice versa.
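
To give a concrete picture of what a “fair” ranking algorithm does, the sketch below shows a generic greedy re-ranker that enforces a minimum share of an underrepresented group in every prefix of the list. It is an illustrative stand-in, not the FairDet-Greedy algorithm or any platform’s actual ranking system.

```python
# Generic greedy re-ranker with a minimum-representation constraint.
# Illustrative only; not FairDet-Greedy or any platform's actual ranking system.

def fair_rerank(candidates, is_protected, min_share=0.4):
    """
    candidates:   list of candidates already sorted by relevance (best first).
    is_protected: function returning True if a candidate belongs to the
                  underrepresented group.
    min_share:    minimum fraction of protected candidates required in every
                  prefix of the returned ranking (when enough are available).
    """
    protected = [c for c in candidates if is_protected(c)]
    others = [c for c in candidates if not is_protected(c)]
    ranking = []

    while protected or others:
        k = len(ranking) + 1
        need_protected = sum(is_protected(c) for c in ranking) < min_share * k
        if protected and (need_protected or not others):
            ranking.append(protected.pop(0))
        else:
            ranking.append(others.pop(0))
    return ranking

# Example: (name, group) tuples already sorted by score.
cands = [("A", "m"), ("B", "m"), ("C", "f"), ("D", "m"), ("E", "f"), ("F", "m")]
print(fair_rerank(cands, is_protected=lambda c: c[1] == "f", min_share=0.4))
```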

In their analysis, the researchers found that while fair ranking algorithms like FairDet-Greedy are helpful in boosting the number of underrepresented candidates hired, their effectiveness is limited by the job contexts in which employers have a preference for particular genders. The respondents were less likely to choose women for moving jobs compared with men, for example, and less likely to hire men for event staffing than women.

The researchers also report that they found fair ranking to be more effective when underrepresented candidates (e.g., women) are similar to those who are overrepresented (e.g., men). But they also found fair ranking to be ineffective at increasing representation when employers attempt to represent “demographic parity” — i.e., when they actively try but sometimes fail to make a diverse choice.

“Our study reveals that fair ranking can successfully increase the opportunities available to underrepresented candidates. However, we find that the effectiveness of fair ranking is inconsistent across job contexts and candidate features, suggesting that it may not be sufficient to increase representation outcomes in all settings,” the researchers wrote. “We hope that this work represents a step toward better understanding how algorithmic tools can (or cannot) reduce gender bias in hiring settings.”

Bias in hiring algorithms is nothing new — in a recent example, Amazon scrapped a recruiting engine that showed a clear bias against women. But it’s becoming more relevant in light of the fact that a growing list of companies, including Hilton and Goldman Sachs, are looking to automate portions of the hiring process. In fact, some 55% of U.S. human resources managers said AI would be a regular part of their work within the next five years, according to a 2017 survey by talent software firm CareerBuilder.

A Brookings Institution report advocated several approaches to reduce bias in algorithms used in hiring, including identifying a range of model inputs that can be predictive across a whole population and developing diverse datasets containing examples of successful candidates from a variety of backgrounds. But the report also noted that these steps can’t necessarily be taken by debiasing a model.

“Algorithmic hiring brings new promises, opportunities, and risks. Left unchecked, algorithms can perpetuate the same biases and discrimination present in existing hiring practices,” the Brookings report reads. “Existing legal protections against employment discrimination do apply when these algorithmic tools are used; however, algorithms raise a number of unaddressed policy questions that warrant further attention.”


Big Data – VentureBeat


How to connect to CDS from #PowerBI – Or where the h.. can I find the server URL

November 11, 2020   Self-Service BI

Over the last few months I have used CDS a few times in my solutions – and connected the data to Power BI.

But one of the things I always have to search for is the server URL.

[Screenshot: Common Data Service connector]

Above is the screenshot of the Common Data Service connector.

[Screenshot]

And even the beta connector requires me to specify the URL:

[Screenshot: the beta connector dialog asking for the environment domain]

And even though the dialog says “Environment domain” – it is in fact the URL the connector wants – BUT without the https://

Note – If you want to test the Beta connector, remember to enable the TDS endpoint under Environments – Settings – Features.

[Screenshot: the TDS endpoint setting under Environments – Settings – Features]

So where can I find the address?

If you have access to the admin center – Power Platform admin center (microsoft.com) – you can go into the environment and see the URL.

[Screenshot: the environment URL in the Power Platform admin center]

If you do not have access to it, then open the model-driven app and the URL is available in the address bar.

[Screenshot: the URL in the browser address bar of the model-driven app]

Hope this can help you.

Power On !


Erik Svensen – Blog about Power BI, Power Apps, Power Query
