Category: Data science

  • “Designing a Conceptual Data Model for a Non-Profit Organization: A Case Study with Oakmont & Partners LP” Title: Initial Data Model for Refugee Intake and Assistance Program

    https://lucid.app/documents#/documents?folder_id=recent
    Motivation
    Start by reading the entire assignment and making a note in your calendar for the due date and the late submission window; set reminders. Block uninterrupted time slots of at least 45 minutes at a time to work on the assignment. Work on the assignment daily after working through the lessons.
    In the context of database design, the conceptual data model stands as a foundational pillar, playing a critical role in shaping the structure and future performance of the database. It serves as a blueprint, guiding developers and stakeholders through the complex landscape of data requirements and relationships. This model, often abstract and technology-agnostic, lays down the groundwork for more detailed and specific design stages, ensuring that the database aligns with the organizational goals and user needs.
    The importance of conceptual data modeling cannot be overstated; it acts as a communication tool that brings together different stakeholders, including business analysts, developers, and end-users, facilitating a common understanding of the data requirements. This collaborative approach ensures that the database is not only technically sound but also aligns with business objectives and user expectations. Furthermore, a well-designed conceptual model can significantly reduce complexity in later stages of database design, leading to more efficient implementation and easier maintenance. By addressing data inconsistencies and redundancies at an early stage, it also enhances data quality and integrity, which are crucial for reliable decision-making and operational efficiency. In essence, conceptual data modeling is not just a preliminary step; it is the strategic planning phase that determines the success and adaptability of the database system in a rapidly evolving data-driven world.
    The conceptual data model is often expressed visually in a UML Class Diagram; it is database-agnostic and can serve as the foundation for relational or non-relational database designs. The logical model, on the other hand, is geared towards implementing the conceptual data model in a relational database. While UML Class Diagrams can be refined towards a relational implementation, it is common to express relational designs in an entity-relationship diagram (ERD). There are several common ERD notations in use today; the choice often depends on available tools. In this assignment, we will use the common Information Engineering (IE) notation, also often known as the “Crow’s Feet” notation because the multiplicity indicators look like the feet of crows (a kind of bird).
    Format
    While you may collaborate with others in the class you must submit your own model.
    Case Background
    Oakmont & Partners LP has been busy building data models for a number of new clients, and Monica is excited to start a new modeling effort for a local non-profit that provides assistance to the homeless and others in need. She is keen on applying the UML and ERD modeling skills that she acquired while taking a corporate training class taught by Ars Doceo. She knows that the first step in building a conceptual data model is to conduct requirements analysis. So, she sets up an interview with Kaileen Ormond, the Director of Social Pedagogy, who has been with the organization for over 18 years. To make sure she remembers what is being said, she decides to record the interview. The following is a transcription of part of that recording:
    “… Let me give you a scenario. So, last month we took in 132 new homeless individuals and families. After we register them and determine their eligibility to shelters, we assign them to a bed in a common dormitory room in a shelter — if they are by themselves — or to a small family room if they are a family of five or fewer. If they are larger, then we generally cannot help them and we refer them to social services. Once we assign them to temporary housing (we need to record the date of the intake, as they can only stay a maximum of six months), we have to see if they can qualify for any government assistance programs such as WIC. Those programs are only accessible to US Citizens or permanent residents; they are not open to asylum seekers or anyone who is undocumented or is on a non-immigrant visa.”
    Monica thinks she has enough to develop an initial data model but is reassigned to a new project where her skills in agile business analysis are needed. So, you are being asked to jump in and build an initial data model. Your task is to develop a conceptual and then a logical data model for the entities and relationships within the context of the above requirements, along with a full definition of all entities. Be sure to list all your assumptions used in the construction of the data models.
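    The intake rules stated in the interview can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the assignment deliverable: the names `HousingType`, `assign_housing`, and `latest_stay_date` are hypothetical, a party of two or more is assumed to count as a family, and "six months" is assumed to mean six calendar months from the intake date.

```python
import calendar
from datetime import date
from enum import Enum
from typing import Optional


class HousingType(Enum):
    DORMITORY_BED = "bed in a common dormitory room"
    FAMILY_ROOM = "small family room"


MAX_FAMILY_SIZE = 5   # "a family of five or fewer" (from the interview)
MAX_STAY_MONTHS = 6   # "a maximum of six months" (from the interview)


def assign_housing(party_size: int) -> Optional[HousingType]:
    """Return the housing type, or None when the party is referred to social services."""
    if party_size < 1:
        raise ValueError("party_size must be at least 1")
    if party_size == 1:
        return HousingType.DORMITORY_BED
    if party_size <= MAX_FAMILY_SIZE:   # assumption: 2-5 people count as a family
        return HousingType.FAMILY_ROOM
    return None                         # larger families are referred elsewhere


def latest_stay_date(intake: date, months: int = MAX_STAY_MONTHS) -> date:
    """Add calendar months to the intake date, clamping to the end of the month."""
    month_index = intake.month - 1 + months
    year = intake.year + month_index // 12
    month = month_index % 12 + 1
    day = min(intake.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)
```

    Writing the rules down this way shows why the intake date has to be recorded: the end of the permitted stay is derived from it rather than stored separately.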
    Problem 1 (60 Pts): Conceptual Data Model in UML
    Express the data model in a UML Class Diagram using LucidChart. Label the relationships where useful. Use directionality indicators on the labels (▲▼▶◀). Add key attributes as appropriate with the stereotype «key».
    To narrow and focus the scope of the model, consider only the specific requirements below. Those are the ones that the conceptual data model expressed in UML must support; you may omit any other considerations, as this is clearly a very large project. The likely implementation will be a small application, perhaps a web app, and this data model will help inform the database design and the user interface.
    • track the names and key demographic information (birthday, country of birth, citizenship or visa status) of all individuals registered
    • track immediate familial relationships, e.g., children (son, daughter), parents, grandparents
    • track the shelter to which they are assigned and the type of housing (bed or family room), including address
    • know when they were registered and when their permit to reside in-country expires
    • track eligibility for government assistance: while you do not need to address this, how would you manage eligibility for different programs, e.g., a person might be eligible for one program but not for another
    • legal representative(s) hired by or assigned to them
    If there are unresolved questions from the notes, post your question on Teams and incorporate the new findings into your model. You may discuss the problem and share insights with your peers in the class, but you must build and submit your own model. Keep your model to about 6-8 classes/entities; you may have more if warranted, but we do not want you to “overmodel”. Make reasonable assumptions when the requirements are not fully specified, document your assumptions in your model, and then build your UML model accordingly.
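    The open question in the requirements about per-program eligibility can be illustrated with a small sketch. It is one possible approach, not the expected answer: each program carries its own set of accepted statuses, so a person can be eligible for one program and not another. The class names, the status list (taken from the statuses mentioned in the interview), and the "Example Open Program" entry are all hypothetical; the WIC rule reflects only what the interviewee says, not the program's actual regulations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List


class ResidencyStatus(Enum):
    US_CITIZEN = "US citizen"
    PERMANENT_RESIDENT = "permanent resident"
    ASYLUM_SEEKER = "asylum seeker"
    UNDOCUMENTED = "undocumented"
    NON_IMMIGRANT_VISA = "non-immigrant visa"


@dataclass(frozen=True)
class AssistanceProgram:
    name: str
    accepted_statuses: FrozenSet[ResidencyStatus]  # eligibility is defined per program


@dataclass(frozen=True)
class Person:
    name: str
    status: ResidencyStatus


def eligible_programs(person: Person, programs: Iterable[AssistanceProgram]) -> List[str]:
    """Names of the programs whose accepted statuses include the person's status."""
    return [p.name for p in programs if person.status in p.accepted_statuses]


# Rule as described in the interview (citizens and permanent residents only).
wic = AssistanceProgram(
    "WIC",
    frozenset({ResidencyStatus.US_CITIZEN, ResidencyStatus.PERMANENT_RESIDENT}),
)
# Hypothetical program, invented here to show that rules can differ per program.
open_program = AssistanceProgram("Example Open Program", frozenset(ResidencyStatus))

programs = [wic, open_program]
citizen = Person("Example Citizen", ResidencyStatus.US_CITIZEN)
asylum_seeker = Person("Example Asylum Seeker", ResidencyStatus.ASYLUM_SEEKER)
```

    In a class diagram, this corresponds to treating eligibility as an association between the person and the program rather than as a single yes/no attribute on the person.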
    Problem 2 (40 Pts): Logical Data Model as ERD
    After you have built your conceptual data model as a UML Class Diagram, create (in a separate page/tab) an Entity-Relationship Diagram (ERD) in the IE (Crow’s Feet) notation of the same entities, i.e., “translate” the UML to an ERD. This may not always be done in practice, but we want you to practice using both notations.
    Time box your work to the allotted time of about 3 hours. If you spend substantially more than 3 hours, then you are overthinking the problem. This is a large problem and we do not expect you to build a full domain data model; we want you to get started and pay attention only to the use cases. Of course, if you do want to explore the problem further, you may, but you will not get additional points or “extra credit”.
    Submission Details
    Submit a public URL to your Conceptual Model as a UML Class Diagram and your Logical Model as an IE (Crow’s Feet) ERD, along with (in separate “tabs”) any notes and assumptions. You must use LucidChart to create the diagrams and track your notes in LucidChart.

  • “Exploring the Impact of Current and Emerging Trends on Organizations: An Ecological Perspective on Data Collection, Analysis, and Problem-Solving”

    1. (Critically) evaluate the impact of current and emerging trends on
    organizations. 
    2. Express mastery of the ecological approach and field work data collection
    processes.
    3. Articulate the need to collect data and manage data from an ecological
    perspective in order to solve problems in society. 
    4. Demonstrate an ability to effectively analyse and visualise problems and issues,
    employing a range of appropriate concepts, theories and approaches relevant to
    human needs.
    5. Establish and articulate the data quality process, where the problems come
    from and how they can be resolved. 
    6. Apply tools and techniques of strategic and operations analysis on how
    technology addresses human needs. 
    7. Develop succinct business reports.   

  • “Database Design and Implementation Report”

    Please check all the files.
    Guidance and Presentation
    For Part A students are expected to write up their answers as a report. The report should look
    professional and provide all the necessary information for Part A.
    • Maximum 15 pages including diagrams and images.
    • Additional diagrams and images should go into an appendix.
    For more information you should see the rubric provided at the end of this document.
    Submission Requirements
    • Students are required to submit a Word/PDF document that contains answers to Part A.
    • Students should also submit their Microsoft Access implementation for Part B.
    • These should all be submitted to the submission point on Course Resources before the
    deadline.

  • “Creating a Professional Resume: A Step-by-Step Guide”

    Please see attached instruction/template. If you have any questions, please do not hesitate to ask. Thank you!

  • “Exploring the Relationship between Data Science and Advanced Quantitative Methods in Psychology” Guideline: 1. Introduction – Briefly introduce the topic of data science and its relevance to psychology – Explain the importance of using advanced quantitative methods in psychological research

    The subject is Advanced Quantitative Methods in Psychology; I couldn’t find this subject, and Data Science is the closest one to it.
    Please go through the attached guideline and check that you really understand what to do. I’ve done this before myself, and I normally always do it myself, but this time I failed; it’s my last chance and I need it to be done perfectly. If needed, I can also share the comments on my previous work so you can see what shouldn’t be done. All needed attachments are below.

  • “Mapping Misinformation: Analyzing the Dynamics and Impacts of False Information in Digital Media” Slide 1: Introduction – Thesis title and author’s name – Brief overview of the topic – Significance of the study Slide

    I want to edit and combine my works in the papers I have attached to create a 60-page thesis without references and make 60 presentation slides for the same content.
    I want it to be like this:
    Thesis Title: “Mapping Misinformation: Analyzing the Dynamics and Impacts of False Information in Digital Media”
    Chapter 1: Introduction
    Overview of the Thesis Topic
    Significance of the Study
    Objectives and Research Questions
    Structure of the Thesis
    Chapter 2: Literature Review
    Theoretical Framework on Misinformation and Its Effects
    Review of Previous Studies on Misinformation Dynamics
    Impact of Misinformation on Public Health, with Focus on COVID-19
    Include tables summarizing key studies or data (from your previous works).
    Chapter 3: Methodology
    Data Collection Methods (detailing how the data for each study was collected, including your previous works)
    Analytical Techniques (how you analyzed the data, consistent with the methods from your previous papers)
    Ethical Considerations
    Chapter 4: Case Study Analysis
    Geolocation Analysis of Misinformation (from “Mapping the Infodemic”)
    Topic Modeling of COVID-19 Misinformation (integration of findings from your various studies)
    Analysis of Misinformation Vacillation (from “SKM_Misinformation_Vacillation_Final”)
    Chapter 5: Results
    Synthesis of Findings Across Different Studies
    Comparative Analysis of Misinformation Trends and Their Implications
    Tables and Figures Illustrating Key Patterns and Trends
    Chapter 6: Discussion
    Interpretation of Results in the Context of Existing Literature
    Implications for Policy and Public Health
    Limitations of the Research
    Chapter 7: Conclusion and Recommendations
    Summary of Key Findings
    Recommendations for Future Research
    Strategies for Combating Misinformation
    References
    Comprehensive List of All Sources Cited
    Appendices
    Additional Data, Codebooks, or Methodological Details
    Also, I have attached the PowerPoint template 

  • Title: HW02 – Excel File Renaming and Submission

    Download the Excel file from Assignments -> HW02. It is titled HW2.xlsx.
    Rename it to userIDHW2.xlsx. Don’t change the sheet names from P1 through P4.
    Upload it back to the Assignments -> HW02 link before midnight the evening of Wednesday, May 15. As long as it is there before class begins on Thursday morning, you will get full credit. No credit is received if submitted late.
    I will post the solutions for your review.
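    For anyone who prefers to script the rename step, a minimal sketch follows. The function name `rename_homework` and its arguments are hypothetical, and `userID` stands for whatever ID the course expects. Renaming the file this way changes only the filename and leaves the workbook contents, including the sheet names P1 through P4, untouched.

```python
from pathlib import Path


def rename_homework(folder: Path, user_id: str, original: str = "HW2.xlsx") -> Path:
    """Rename HW2.xlsx in `folder` to <user_id>HW2.xlsx and return the new path."""
    source = folder / original
    target = folder / f"{user_id}{original}"
    if target.exists():
        # Avoid silently overwriting an earlier renamed copy.
        raise FileExistsError(f"{target} already exists")
    source.rename(target)   # changes the filename only; file contents are untouched
    return target
```

    A call such as `rename_homework(Path.home() / "Downloads", "jdoe")` would assume the file was saved to a Downloads folder and that `jdoe` is the user ID; both are placeholders.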

  • “Mapping Misinformation: Analyzing and Combining Research on False Information in Digital Media” Chapter 1: Introduction – Overview of Misinformation in Digital Media – Significance of the Study – Objectives and Research Questions – Thesis Structure

    I want to edit and combine my work to create a 60-page thesis without references and make 60 presentation slides for the same content.
    I want it to be like this:
    Thesis Title: “Mapping Misinformation: Analyzing the Dynamics and Impacts of False Information in Digital Media”
    Chapter 1: Introduction
    Overview of the Thesis Topic
    Significance of the Study
    Objectives and Research Questions
    Structure of the Thesis
    Chapter 2: Literature Review
    Theoretical Framework on Misinformation and Its Effects
    Review of Previous Studies on Misinformation Dynamics
    Impact of Misinformation on Public Health, with Focus on COVID-19
    Include tables summarizing key studies or data (from your previous works).
    Chapter 3: Methodology
    Data Collection Methods (detailing how the data for each study was collected, including your previous works)
    Analytical Techniques (how you analyzed the data, consistent with the methods from your previous papers)
    Ethical Considerations
    Chapter 4: Case Study Analysis
    Geolocation Analysis of Misinformation (from “Mapping the Infodemic”)
    Topic Modeling of COVID-19 Misinformation (integration of findings from your various studies)
    Analysis of Misinformation Vacillation (from “SKM_Misinformation_Vacillation_Final”)
    Chapter 5: Results
    Synthesis of Findings Across Different Studies
    Comparative Analysis of Misinformation Trends and Their Implications
    Tables and Figures Illustrating Key Patterns and Trends
    Chapter 6: Discussion
    Interpretation of Results in the Context of Existing Literature
    Implications for Policy and Public Health
    Limitations of the Research
    Chapter 7: Conclusion and Recommendations
    Summary of Key Findings
    Recommendations for Future Research
    Strategies for Combating Misinformation
    References
    Comprehensive List of All Sources Cited
    Appendices
    Additional Data, Codebooks, or Methodological Details
    Also, I have attached the PowerPoint template 

  • “Optimizing Business Data Storage and Analytics: A Database Management Plan” Slide 1: Introduction – Title: Optimizing Business Data Storage and Analytics: A Database Management Plan – Speaker’s Name: [Your Name] – Company Name:

    You are the vice president of information technology at a small, growing business. You have been tasked with developing a plan for maintaining databases for storage of business data and use in business analytics.
    Using the work from Weeks 1–5, create a 20-minute presentation (10 to 12 slides) to explain your Database Management Plan. Ensure you:
    Provide an overview of how databases can be used in a company to store and extract information.
    Distinguish how organizational data can be used in the most effective way through developing a database.
    Compare how structured and unstructured data are used for data analytics, including concepts like the cloud and Hadoop.
    Evaluate and assist company decision makers in understanding the importance of database administration and data governance in relation to building scalable and robust applications.
    List the benefits of data administration compared to database administration.
    Propose an effective data governance program.
    Recommend how individual team roles can contribute to finding ways to build in ongoing monitoring; all roles have an interest in database quality and recovery.
    Summarize how your plan will assist the company in overall effectiveness, including the value of analytic results, such as data visualization and finding and applying patterns.
    Include videos, audio, photos, diagrams, or graphs as appropriate. Include substantial speaker notes or insert audio narration into your presentation. Explore the Microsoft® PowerPoint® website to locate instructions on recording audio for an executive audience. Your goal is to convince them that by implementing your Database Management Plan, the organization will be able to deliver effective, reliable data management support to meet business needs.

  • “Exploring the Predictive Power of Machine Learning for Sports Injury Prevention: A Comparative Analysis of Tree-Based Models and Feature Importance”

    Please finish Chapters 4 to 7 of the report, 12000 in total.
    Chapters 1-3 are already written.
    No SPSS is needed.
    The results need to come from Python, using tree-based model analysis (random forest, XGBoost) and SVM.
    The results and discussion should be written from the following angles:
    Angles
    Machine Learning positioning: 2. Model prediction ability as a product? -> Is machine learning capable of predicting injury in the future? -> From binary classification (this project) to multi-class classification (e.g. predict the injury site instead of injury or not, or predict the injury risk level, e.g. green, yellow, red)
    Model findings on the data through the training and prediction process? -> Feature importance -> What insight comes from the importance, and why does the model treat some features as more important than others after learning from the data?
    Feature importance usage: 1.1. Compare the 3 models and use the results and feature importance ONLY from the best model? -> By what score do we decide which model is best: precision, recall, ROC AUC score, or accuracy? Each has a different interpretation -> Recall: do we not mind false positive cases? Can we treat people falsely predicted as injured as higher-risk patients? In that case, can the model be used as a first line of screening? But if we focus ONLY on recall, the model may lose the basic ability to classify injury or not, e.g. a recall of 1 can mean that the model classifies all patients as injured.
    1.2. Look into the results of the 3 models and find the common ground -> Compare the top 10 important features of the 3 models and find the features that appear in all 3 -> The most common features are important regardless of the differences between the 3 model algorithms, as different algorithms may have different approaches/biases in looking at the data -> Do those common features align with your domain knowledge, e.g. right hip – right knee alignment? -> Find the connection between those commonly important features and your domain knowledge to create your story/justify your domain knowledge with these feature findings
    Data Bias: 2.1. Notice that the right hip and right knee usually have higher importance than the left -> Data bias? Are the majority of the data from right-handers whose main leg is the left? -> As the main leg is the left, is right hip strength weaker and right knee alignment worse? -> As the majority of data inputs are “right”, the model is biased to the right -> A biased model breeds biased results and biased feature importance -> Improvement: should we not differentiate the 2 sides? Or should we change the target of the model from (Yes vs No) to (left side injury vs right side injury vs no injury, which is the multi-class classification mentioned above)?
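    The warning in angle 1.1 about optimising for recall alone can be made concrete with a short sketch. The labels below are made up purely for illustration (1 = injured, 0 = not injured) and are not from the project data; the metric functions are written out by hand so that the arithmetic is visible, although a library such as scikit-learn would normally be used.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (true positives, false positives, false negatives, true negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, _, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


# Made-up ground truth: 3 injured athletes out of 10.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
# A degenerate "model" that labels every athlete as injured.
always_injured = [1] * len(y_true)

print("recall:   ", recall(y_true, always_injured))
print("precision:", precision(y_true, always_injured))
print("accuracy: ", accuracy(y_true, always_injured))
```

    The degenerate predictor reaches perfect recall while its precision and accuracy fall to the share of injured athletes in the data, which is why recall is usually read alongside at least one other score when choosing the best model.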